Introduction


Many people in the microprocessor industry think that the assembly
language programming community is divided into two groups: the
sixers and the eighters. The "sixers" are people who like the CPU chips that
start with the number 6: the 6500, 6800 and the 68000 families. The
"eighters" like the chips starting with 8: the 8080, 8086, 8088, Z-80, 80286
and 80386. The sixers support instruction sets that are very general, with lots 
of addressing modes available and a lot of identical CPU registers that each 
support all of the addressing modes. Typical sixer machines use memory- 
mapped I/O and support a large, flat memory space. Conversely, the eighters 
seem to like CPUs with a few very specialized registers and a lot of powerful 
special-purpose instructions. Typical eighter machines use special instruc- 
tions for I/O and support a segmented memory architecture. 

There seems to be a deep-seated philosophical difference between the two 
groups, and great wars of words have taken place between the various factions 
of the sixers and the eighters. The pages of Dr. Dobb’s Journal of Software 
Tools (DDJ) have long been a battleground for those wars. In this book, 
which contains articles originally printed in DDJ, along with some new 
material, we present information of interest to one of the most active factions 
in the Great Microprocessor Debates: the sixer folks who are fans of 
Motorola's 68000 family. 

I'm an unabashed sixer from way back. I cut my microprocessor teeth on 
the 6502, and spent three years at Atari programming video games on the 
6507 in the VCS game machine. When Motorola first introduced the 68000, I 
was delighted. Here at last was a machine with a huge address space, nice 
clean memory-mapped I/O, a lot of general-purpose registers, and an instruc- 
tion set that applied equally (well, almost) to all the registers. What's more, 
the future seemed to promise more powerful chips in the same family with 
nearly identical instruction sets. A sixer's dream! 

Since then my love for the 68000 family has grown steadily. Apple's Lisa 
and then the Macintosh started what was to become a whole wave of 68000 
machines. Now we have the Amiga (what a glorious computer!), the Atari ST 
and a whole plethora of systems based on the VME bus, one of the most 
sensible and well-supported bus standards I've ever seen. The 68020, the latest 
chip in the family, has proven to be a real powerhouse. Combined with the 
68881 floating-point coprocessor and the 68851 memory management unit, a 
68020 system can easily outperform many of today's minicomputers and 
most of the mainframes of the 1970s. 


Nicholas Turner 
Technical Editor 
MC68000 Family
History, Design 
and Philosophy 


Daniel Appleman 


In this introduction to the MC68000 family of 
microprocessor and support chips, Daniel Appleman 
describes its history, summarizes the features of the 
processors and examines the important differences between 
the 68000 family and the Intel family of processors. 


Motorola's 68000 family, once little more than descriptive literature, has
matured into a full selection of microprocessors and support chips. The
original MC68000 has been followed by the 68008, 68010 and 68020, not to 
mention dozens of support and peripheral chips. Although these 
microprocessors differ in speed, memory addressing range and other details, 
they are based on common operating principles. These principles emerged in 
the late 1970s, with Motorola's introduction of the original 68000 
microprocessor. 

Gaining an understanding of the philosophy behind the 68000 family can 
shorten the learning time required to become an expert 68000 programmer. 
Whether or not you have had 68000 experience, the background provided in 
this chapter will help you understand the applications and examples in the rest 
of the book. 


Of Space, Speed and Support 

In 1979 the personal computer revolution was just beginning. Businesses 
were discovering that minicomputers could make some of their operations 
more efficient, but computer-aided design and large business applications were 
still confined to mainframes. 

In the world of small computers, the 8-bit microprocessor was king. 
Because they were well-suited for single-user, small- and moderate-size 
applications, 8-bit processors were the most common processors in home 
computers. Their limitations—64-kilobyte address space, poor support for 
high-level languages and poor multiuser support—were largely irrelevant in 
the home market. Although these machines were initially successful in
certain business applications, several factors soon sped them on their way to 
obsolescence in the business world. 

The most obvious factor was the rapid advance of technology. Integrated 
circuit (IC) manufacturers were approaching VLSI (very large scale 
integration) technology—the ability to reliably place more than 100,000 
transistors in a single IC. By itself, this ability would probably not have 
assured the development of 16-bit microprocessors. The big push came 
because semiconductor manufacturers identified a large potential market for 
the 16-bit chip—namely, the same businesses and laboratories that were 
using minicomputers and mainframes. The 8-bit processors could not 
compete with minicomputers in those areas; not only did the 8-bit machines 
suffer from the limitations already mentioned, but they were also too slow. 

A 16-bit processor is not simply twice as fast as an 8-bit machine. 
Determining which computer would be faster in a certain application, a 
process called benchmarking, requires a fairly complex examination of cycle 
times, types of instructions and the actual program in question. For the 
purpose of demonstration, consider the example of adding one 16-bit word to 
another in memory. Say there are 4 bytes of memory called A, B, C and D. 
We'd like to add the 2-byte integer in A and B to the 2-byte integer stored in 
C and D and put the result in C and D. 

On an 8-bit microprocessor like the 6800, the code to accomplish the task 
would look something like this: 


        ldaa   D        ;Move D to Accumulator
        adda   B        ;Add B to Acc
        staa   D        ;Store Acc to D
        ldaa   A        ;Move A to Acc
        adca   C        ;Add C (with carry from D+B)
        staa   C        ;Store Acc to C


This code fragment would take 4+4+5+4+4+5=26 clock cycles to execute, 
which is the same as 13 microseconds if we assume a 2 megahertz (MHz) 
clock. 

On a 68000 chip, the code would look like this: 


        move.w A,d0     ;Move A and B to register d0
        add.w  d0,C     ;Add to C and D and store result


This requires 12+16=28 clock cycles, which means 3.5 microseconds if we 
assume an 8 MHz clock. The newer 68000 processors run at an even faster 
12.5 MHz, which would allow the above operation to take place in 2.24 
microseconds. 
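The arithmetic behind these timings is easy to check. The short Python sketch below is only an illustration: the function name `execution_time_us` is our own, and the cycle counts are simply the ones quoted above, not independent measurements.

```python
def execution_time_us(cycle_counts, clock_mhz):
    """Return the execution time in microseconds.

    cycle_counts: clock cycles taken by each instruction in the fragment.
    clock_mhz:    processor clock in megahertz (cycles per microsecond).
    """
    return sum(cycle_counts) / clock_mhz


# Cycle counts quoted in the text for each code fragment.
CYCLES_6800 = [4, 4, 5, 4, 4, 5]   # six 8-bit instructions
CYCLES_68000 = [12, 16]            # move.w followed by add.w

time_6800 = execution_time_us(CYCLES_6800, 2.0)
time_68000_8mhz = execution_time_us(CYCLES_68000, 8.0)
time_68000_12mhz = execution_time_us(CYCLES_68000, 12.5)

print(f"6800 at 2 MHz:      {time_6800:.2f} microseconds")
print(f"68000 at 8 MHz:     {time_68000_8mhz:.2f} microseconds")
print(f"68000 at 12.5 MHz:  {time_68000_12mhz:.2f} microseconds")
```

Note that the 68000 fragment actually uses slightly more clock cycles than the 6800 fragment; the advantage in this small example comes from the faster clock, which is why benchmarking has to look at more than cycle counts alone.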

Differences in execution speed become even more profound when 
comparing more complex operations, such as 32-bit addition, multiplication 
and division. Although 8-bit machines will be around for years in low-end 
computing and many device-control applications, the 16-bit processor is
taking over for high-end applications. 

Unfortunately, increased speed is not the only consideration in designing a 
16-bit microprocessor; the same applications that form the market for 16-bit 
chips also demand minicomputer-like features and architecture. Fortunately, 
the chip designers kept this fact in mind, and the results can be seen in most 
high-end microprocessors. The implementations vary from company to 
company, but the goals are largely the same for all. The ideal 16-bit 
microcomputer: 

* offers a clear upgrade path to 32-bit processors and advanced super- 
minicomputer architectures; 

* provides good support for high-level languages and advanced multitasking 
operating systems; 

* easily interfaces with a wide variety of peripheral chips and supports 
multiprocessor applications. 


A Clear Upgrade Path 


In the late 1970s, prospective manufacturers of advanced microprocessors 
sent marketing representatives to virtually any company that expressed 
interest in these products. Computer design engineers had to choose from 
among the various upcoming microprocessors without the benefit of having 
used them. And the technical considerations were not the only problems 
—engineers also had to consider the long-term future of each microprocessor. 
Would the company be around in five or ten years? Would it provide the 
advanced development tools needed to work with these devices? Would there 
be multiple sources for the product? Would future chips in the family be 
compatible with the current chip? 

The last question was perhaps the most important of all. If a new 
microprocessor in a family was not hardware-compatible with the original, 
each new device—as well as its peripherals—would have to be redesigned to 
be incorporated into a system. (It is therefore no surprise that Motorola and 
Intel, the two main chip contenders, each specified its own system bus stan- 
dards. The VME bus, specified by several manufacturers, including Motorola, 
supports the 68000 family; the Multibus II from Intel is closely tied to the 
8086 family.) 

Another issue raised during the design of the new processors was the
degree to which the new 16-bit devices should be compatible with the existing
8-bit devices. Because several years would pass before there was a large family of
16-bit peripherals, all of the 16-bit processor manufacturers designed 
machines that supported 8-bit peripheral chips. 

In designing the new 16-bit devices, Motorola and Intel took entirely 
different approaches to software compatibility with the 8-bit world. Intel 
attempted to expand the existing 8080 family (registers on the 8086 can be 
accessed either as 16-bit registers or as dual 8-bit registers), hoping that the 
ease of converting 8-bit software to the 8086 would win many new 
microcomputer designs. Motorola took a calculated risk. Since the new 
applications for the 16-bit devices were likely to be in markets that were not
currently supported on 8-bit machines, the company decided to develop a 
software instruction set from scratch. 

The results of those decisions have been mixed—the market seems to have 
split into two clearly divided factions. While a vast array of software has been 
written for the very successful IBM PC and the PC-compatible "clones" (the 
premier 8086 chipset machines), there have been very few truly new designs 
using the Intel chipset. At the same time, many new machines that use the
68000 chipset have emerged. Apple's Macintosh, the Commodore Amiga, the 
Atari ST, the Sun Microsystems and Apollo workstations and many others
fall into this category. The supporters of the 68000 family tend to be quite 
fanatical. Why? Typically, the response from a programmer refers to the 
clean, uniform design of the chips in the family and to the ease of 
programming. Because the 68000 was designed from scratch, no compromises 
were necessary to ensure downward compatibility. At the same time, the 
family was designed with upward compatibility in mind, so that the family's 
later, more powerful chips would not be crippled by the necessary compati- 
bility with earlier models. 


Supporting High-Level Languages and 
Advanced Operating Systems 

While hardware technology was advancing to the point at which 16-bit 
computers could be implemented on a single chip, software technology was 
also making strides. Newer languages such as C and Pascal were gaining in 
popularity for general programming, and specialized languages—APL, Lisp, 
Prolog and Forth, for example—were becoming more common. Most 8-bit 
machines, however, could only be programmed in machine language, BASIC 
and, occasionally, Forth and Pascal. The 8-bit 6500 series processors had only 
256 bytes in their stack, hardly enough memory to support the sophisticated 
parameter-passing and local-variable capability provided by most high-level 
languages. 

Why did high-level languages become so popular? As software develop- 
ment costs became significantly greater than hardware costs due to the in- 
creased sophistication of the applications demanded by the marketplace, any 
tools that enabled a programmer to be more efficient (generate more code in a 
shorter time) were quickly embraced by the computer industry and most 
programmers. Assembly language programming was relegated to extremely 
size- or speed-critical applications. Most major applications were written in 
the newer languages or in combinations of high- and low-level languages. 
With the advent of C, even operating systems could be written efficiently in a 
high-level language (Unix is the primary example). 

The increasingly sophisticated applications required increasingly 
sophisticated operating systems to support them. New word processing 
programs demanded fast updates and the ability to handle enormous 
documents. Advanced databases, which had previously been restricted to large 
mainframes, were being rewritten for small computers. These applications 
demanded more memory and more speed than most 8-bit machines could
provide. Furthermore, 8-bit machines are almost exclusively single-tasking 
systems meant for one user; sophisticated operating systems must support 
multiple users and multiple tasks. 

Multiuser systems themselves introduce another demand that 8-bit 
machines can't handle: protection schemes. Imagine a system being used by 
two people: the first is running a crucial accounting application and the
second is debugging a machine language program. In the course of this 
debugging, the second user accidentally runs the microprocessor's HALT in- 
struction. In an unprotected system, the first user will probably lose all of his 
or her work. None of the major 8-bit processors implements a security 
scheme that protects against accidental or intentional operations that can stop 
the system or scramble data. 

The new machines, however, support large address spaces and virtual 
memory schemes, and have instruction sets designed to run high-level 
languages, complex applications and operating systems efficiently. (See 
Chapter 2.) 


Interfacing With Peripherals and 
Supporting Multiprocessors 

Most microcomputer systems in the late 1970s were based on the original 
work by John Von Neumann. In the Von Neumann architecture, a central 
processing unit (CPU) was connected to peripheral chips and memory by way 
of a main computer bus. (See Figure 1.1.) 


FIGURE 1.1 The Von Neumann architecture. (Diagram: the CPU, memory and
peripheral chips all attach to a common computer bus.)
The processing speed possible with such a system is limited by the
amount of data that can be transferred over the main computer bus (this 
limitation is often referred to as the "Von Neumann bottleneck"). A simple 
technique for improving system performance is to make the peripheral device 
intelligent by adding a microprocessor. This improvement allows some of the 
I/O processing to take place in the peripheral device or chip, thus freeing the 
main processor (and the bus) from this task. Another technique is to have a 
direct memory access (DMA) device take over the bus from the main CPU 
and transfer data at a much higher rate than was possible with the CPU alone. 
Both of these techniques are quite common on 8-bit systems. 

A more sophisticated technique which is becoming more common is 
multiprocessing. At the simplest level, two or more CPUs share a bus with a 
bus arbitration mechanism. In more sophisticated schemes, each CPU has a 
local bus and local resources which can then tie in together to one or more 
main system buses. (See Figure 1.2.) Because multiple CPUs execute 
simultaneously, more work gets done in the same amount of time and the 
effective processing speed is greater. 


Features of the 68000 Family 


The original 68000 processor family provided by Motorola had four main 
members (others have since been added for specialized applications). As Table 
1.1 shows, the 68000, 68008 and 68010 are almost the same chip. 

The 68008 is identical to the 68000 except for the size of the data bus and 
address bus. It was designed for applications in which the system could use 
the power of a 16-bit machine, but could not justify the expense of the 16-bit 
support hardware. The 68010 is an improved version of the 68000 that 
supports virtual memory. From a programming point of view, the two chips 
are almost identical. The 68020, on the other hand, incorporates major 
improvements in both the hardware and the instruction set, yet it maintains
full compatibility with the other chips and will run their object code directly.
Figure 1.3 shows the major structural differences among the processors 
within the family. 

The 68000 doesn't look like a 16-bit processor. True, it has a 16-bit data 
bus and a 16-bit arithmetic logic unit (ALU), but all of the registers are 32 
bits wide. This increased register size is one of the most important ways in 


                  68000     68008     68010     68020
Registers         17        17        20        23
ALU Width         16        16        16        32
Address Bus       24 bits   20 bits   24 bits   32 bits
Address Space     16 MB     1 MB      16 MB     4 gigabytes
Data Bus          16 bits   8 bits    16 bits   8/16/32 bits
Data Registers    32 bits   32 bits   32 bits   32 bits


TABLE 1.1 The 68000 family. 
FIGURE 1.3 68000 family architecture.

68000:             Data Registers; Address Registers; Condition Code
                   Register (CCR); Supervisor (Interrupt) Stack Pointer (ISP)
68010 extensions:  Vector Base Register (VBR); Alternate Function Code
                   Registers (SFC, DFC)
68020 extensions:  Master Stack Pointer (MSP); Cache Control Register (CACR);
                   Cache Address Register (CAAR); Cache Memory (entries 0-63)
which Motorola provides a clear upgrade path. Programs written for the
68000 family take full advantage of 32-bit operations and will run on true 32- 
bit machines. This upward compatibility is an enormously powerful concept, 
especially when contrasted with the approach that Intel took with its 8086 
family. 

The 8086 family supports compatibility by crippling its high-end proces- 
sors. In other words, the 80286 processor runs 8086 programs by disabling 
many of the 80286 features. Each generation in Intel's processor family 
advances by "gluing" more features onto the new chip. Motorola, however, 
designed a full 32-bit architecture from the beginning, kept that structure as 
the programming model and implemented it on 16-bit machines. As a result, 
most programs written for the 68000 family run equally well, with no 
modifications, on all members of the family. 


Address Space 

The 68000 family supports a full 32-bit (4-gigabyte) address space. Only 
the 68020 brings all of the address lines out of the chip package, but the 16 
megabytes supported by the 68000 is quite respectable. In addition to support- 
ing large applications and multitasking operating systems, a large address 
space is important for its ability to handle large amounts of data. For ex- 
ample, an image-processing application may deal with high-resolution 
graphics frames. A common screen size is 1024-by-1024 pixels. With 256
colors, this screen would require a whole megabyte by itself. Since such 
applications commonly require at least two frames to be present in memory at 
once, it is clear that 16 megabytes, while generous, is not unreasonable. 
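The frame-size figure can be verified with a few lines of Python. This is only a sketch of the arithmetic; the helper name `frame_bytes` is our own, and it assumes pixels are packed with no padding or per-line overhead.

```python
def frame_bytes(width, height, colors):
    """Bytes needed for one frame, assuming tightly packed pixels."""
    # Bits needed to give each pixel one of `colors` distinct values.
    bits_per_pixel = (colors - 1).bit_length()
    return width * height * bits_per_pixel // 8


one_frame = frame_bytes(1024, 1024, 256)
two_frames = 2 * one_frame
address_space = 16 * 1024 * 1024     # the 68000's 16 megabytes

print(one_frame)                      # bytes in a single frame
print(two_frames / address_space)     # fraction of the address space used
```

Two such frames alone consume one-eighth of the 68000's 16-megabyte address space, before any code or other data is counted.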


Virtual Memory 

In the ideal system, a programmer would never have to worry about run- 
ning out of memory. It is not merely annoying to try to write large pro- 
grams with limited memory; the time and effort spent managing memory is 
overhead that does not directly improve the application. Few microcomputers 
have more than a couple of megabytes of real memory, but by supporting 
advanced memory management techniques and virtual memory, it is possible 
for programmers to work as if vast amounts of memory were actually present. 

How can the memory management unit (MMU) be used to help imple- 
ment a virtual-memory system? Here is a brief description of the principle 
involved. (Chapter 3 discusses this issue in greater detail.) 

Let's say you want a system to have 10 megabytes of memory, which can 
cost several thousand dollars. A less expensive solution might be to provide 
256K of high-speed RAM and a 10-megabyte disk drive. The 10-megabyte 
address space is actually stored on disk and divided into pages (in this case, 
we'll use 1K pages). Figure 1.4 shows a block diagram of such a system. 

Pages are copied into high-speed RAM as needed. A page table keeps track 
of which pages are in RAM and which of these have been modified and thus 
need to be written to disk. The behavior of such a system is perhaps best 
understood by looking at a sample algorithm used to read and write data. 
FIGURE 1.4 Virtual-memory system. (Diagram: the memory management unit
(MMU) uses a page table of 10K entries, one for each page, to map a
10-megabyte address space, held on disk as 10K pages of 1 kilobyte each.)
FIGURE 1.5 Synchronous bus. Device must be ready at end
of T3 or insert wait states. 
Procedure Read a word
    Check addressed page in page table
    If page is in RAM then read word from RAM
    else begin
        suspend the current instruction
        call subroutine to read page from disk
        read word from RAM
        continue instruction
    end
end of procedure


For writing a word, substitute write for read in the above algorithm. 


Procedure Read page from disk
    Check the page table for a free page of RAM
    If available then begin
        read the page from the disk
        update the page table
    end
    else begin
        select a page in RAM to be cleared
        if it has been modified then write it to disk
        read the new page from disk
        update the page table
    end
end of procedure
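The two procedures above can be tried out in software. The following Python sketch is a toy model, not a description of any real MMU: the class name `VirtualMemory`, the byte-wide accesses, and the choice of the oldest resident page as the one to clear are all our own assumptions (the procedure only says "select a page").

```python
from collections import OrderedDict

PAGE_SIZE = 1024  # 1K pages, as in the example system


class VirtualMemory:
    """Toy model of the read/write and page-loading procedures."""

    def __init__(self, num_frames):
        self.num_frames = num_frames   # pages of real RAM available
        self.disk = {}                 # page number -> bytearray on disk
        self.ram = OrderedDict()       # page number -> [data, modified]
        self.page_faults = 0
        self.writebacks = 0

    def _read_page_from_disk(self, page):
        """Bring a page into RAM, clearing a resident page if necessary."""
        self.page_faults += 1
        if len(self.ram) >= self.num_frames:
            # No free page of RAM: clear the oldest resident page
            # (an assumed policy) and write it back only if modified.
            victim, (data, modified) = self.ram.popitem(last=False)
            if modified:
                self.disk[victim] = data
                self.writebacks += 1
        fresh = bytearray(self.disk.get(page, bytes(PAGE_SIZE)))
        self.ram[page] = [fresh, False]   # update the page table

    def _locate(self, address):
        """Check the page table, loading the page on a miss."""
        page, offset = divmod(address, PAGE_SIZE)
        if page not in self.ram:
            self._read_page_from_disk(page)
        return self.ram[page], offset

    def read(self, address):
        entry, offset = self._locate(address)
        return entry[0][offset]

    def write(self, address, value):
        entry, offset = self._locate(address)
        entry[0][offset] = value & 0xFF
        entry[1] = True                   # mark the page as modified


vm = VirtualMemory(num_frames=2)
vm.write(0, 42)        # touches page 0
vm.write(1024, 7)      # touches page 1
vm.read(2048)          # touches page 2, forcing page 0 out to disk
print(vm.read(0), vm.page_faults, vm.writebacks)
```

With only two pages of RAM, the fourth access has to bring page 0 back from disk, and the value written earlier is still there because the modified page was saved before being cleared.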


For virtual-memory systems to work reasonably quickly, the page table 
lookup and decision of whether a disk access is necessary must be done in 
hardware. This is one of the jobs of the MMU. It must also be possible for 
the processor to stop in the middle of an instruction when the page fault 
occurs, execute a routine to read or write a page to or from disk, then continue 
the instruction. This procedure is not quite the same as stopping between 
instructions (which is possible with any processor). The 68010 chip is the 
first in the family that supports instruction continuation—the original 68000 
does not. 

A subtle benefit of virtual-memory systems is that a power failure or 
system crash won't necessarily cause the loss of all of the information in the 
memory address space. A properly designed operating system could permit 
almost complete recovery, because the disk paging area would contain a 
nearly up-to-date copy of the RAM. 


Synchronous vs. Asynchronous Bus 

This issue, crucial to systems architects and hardware designers, is another 
area in which the Intel and Motorola families differ radically. With a 
synchronous bus (used by Intel), every system event is referenced to a single 
system clock. Figure 1.5 diagrams a read operation on the 8086 bus. 

On an asynchronous bus like the VME bus, which is used for many 68000 
systems, events do not need to be referenced to the system clock. Transfers are 
made through a process called handshaking: a device requests a transfer and the
resource acknowledges the transfer when complete. Figure 1.6 shows a read 
operation on the 68000 bus. 

In small- to moderate-size systems, synchronous buses are typically easier 
to implement than asynchronous ones. The major problem with a 
synchronous bus is that to increase its speed you must increase the speed of 
every device on it. If a device cannot respond fast enough for the higher 
system speed, it must be modified to insert wait-state delays when it is 
accessed (thus reducing the advantage of increased bus speed). A limited 
number of devices can be attached to a bus segment without additional 
buffers, which add more delays. Supporting devices of various speeds can be 
difficult because it requires different numbers of wait cycles. 

On an asynchronous bus, each device is responsible for acknowledging 
when it is ready. Speeding up the bus is a simple matter of increasing the 
system clock speed. Since the CPU synchronizes device responses by waiting 
for the "ready" signal, internal delays are generated automatically. Thus, the 
asynchronous bus is preferable in large systems and in systems that require 
flexibility or varied-speed devices. 


Privilege and Traps 

The 68000 family implements security by providing two levels of priv- 
ilege. The following instructions are available only in supervisor mode: 

* modification of the status register 

* move to or from the user stack pointer

* move to or from control register or alternate address space

* reset external devices 

* return from exception 

* stop program (halt the processor) 

Almost all applications of 16-bit microprocessors and most applications of 
8-bit processors use some form of operating system (OS). There must be a 
mechanism by which a program can transfer execution to an operating system 
subroutine. The mechanism for this transfer can vary. In CP/M systems, a 
fixed address contains the address of a jump table of the various OS subrou- 
tines. This mechanism was fairly common on 8-bit processors, mainly be- 
cause many of them could not implement any other method. 

The fixed-address method has one major problem. Since one of the 
requirements of a secure system is preventing users from accessing or 
modifying memory that does not belong to them, implementing such security 
implies that no such fixed address or jump table can be available to users. 

A better approach is to use traps. A trap is similar to an interrupt except
that it is caused by an application. The 68000 provides for 16 trap vectors.
When the processor executes a trap instruction, it makes a subroutine-like
call through the specified trap vector. In addition, the processor exits user
mode and enters the supervisor (privileged) mode, allowing the operating
system access to all of the system's resources. (See Chapter 2.)
In addition to this basic use of traps, the 68010 and 68020 each contain a
vector base register, which specifies the location of the exception vector table. 
The ability to change the location of the trap vector table can be useful when 
using the system to debug a new operating system. The master OS will have 
one set of exception vectors and the new OS will have a completely different 
one, which normally includes routines to emulate resources that do not yet 
exist. The vector base register considerably reduces the overhead required to 
switch between the master and target systems. 
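To make the role of the vector base register concrete, here is a small Python sketch of how a vector's location in memory can be computed. The function names are our own. The sketch assumes that each vector is a 32-bit (4-byte) address, that the 16 user traps occupy vectors 32 through 47 as listed in Table 1.2, and that on the original 68000, which has no vector base register, the table starts at address 0.

```python
VECTOR_SIZE = 4          # each vector holds a 32-bit address
FIRST_TRAP_VECTOR = 32   # user traps occupy vectors 32-47


def trap_vector_number(trap):
    """Vector number used by trap number `trap` (0-15)."""
    if not 0 <= trap <= 15:
        raise ValueError("trap number must be in the range 0-15")
    return FIRST_TRAP_VECTOR + trap


def vector_address(vector, vbr=0):
    """Memory address of an exception vector.

    vbr is the vector base register; leave it at 0 to model the
    original 68000, whose vector table is fixed at the bottom of memory.
    """
    if not 0 <= vector <= 255:
        raise ValueError("vector number must be in the range 0-255")
    return vbr + vector * VECTOR_SIZE


# Trap 0 under a master OS with its table at address 0 ...
print(hex(vector_address(trap_vector_number(0))))
# ... and under a target OS whose table was relocated (address is hypothetical).
print(hex(vector_address(trap_vector_number(0), vbr=0x10000)))
```

Switching between the two systems then amounts to loading a different value into the vector base register rather than rewriting the table itself.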


Exceptions 

In addition to the 16 user trap vectors mentioned, the 68000 family 
supports 255 exception vectors. Table 1.2 describes some of the most 
commonly encountered exception vectors. (Note: This list refers to the 
68020; some of the vectors are not used on the 68000 and 68010.) 


Multilevel Interrupts 

External interrupts are common in the 8-bit world, but most 8-bit systems
are limited to two levels of interrupts, often labeled "interrupt" and
"non-maskable interrupt." Implementing additional levels is not very
difficult; many interrupt controller chips are available. Nevertheless,
building into the CPU chip the ability to handle seven levels allows simple
systems to be built with minimal hardware. Supporting multiple interrupt
priorities is especially important in sophisticated multitasking systems.
Lower-level interrupts are commonly used for switching tasks and terminal
I/O; mid-level interrupts can be used for system clocks, tapes and diskettes;
high-level interrupts are needed for high-speed communications and disk I/O.
If all the interrupts were at the same level, a relatively low-priority
interrupt (the time-of-day clock, for example) could interfere with a
time-critical interrupt, such as input over a high-speed data port, causing
loss of data.
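The effect of prioritized interrupts can be sketched in a few lines of Python. This is a simplified model under stated assumptions rather than a description of the hardware: the function name is our own, and the masking rule (a request is accepted only if its level is higher than the processor's current priority mask, with level 7 always accepted) is our understanding of the 68000's behavior, not something spelled out in this chapter.

```python
def next_interrupt(pending_levels, mask):
    """Return the interrupt level (1-7) to service next, or None.

    pending_levels: levels currently being requested.
    mask:           current processor priority (0-7); requests at or
                    below it are held off, except level 7.
    """
    for level in pending_levels:
        if not 1 <= level <= 7:
            raise ValueError("interrupt levels run from 1 to 7")
    accepted = [lvl for lvl in pending_levels if lvl > mask or lvl == 7]
    return max(accepted) if accepted else None


# A clock tick (level 2) and a fast data port (level 6) arrive together.
print(next_interrupt({2, 6}, mask=0))
# While the data port is being serviced, the clock tick has to wait.
print(next_interrupt({2}, mask=6))
```

Because the data port's handler runs at a higher priority than the clock, the clock can never delay it, which is exactly the data-loss case the single-level scheme cannot prevent.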


FIGURE 1.6 Asynchronous bus. DTACK is the acknowledge
signal from the device. (Signals shown: Address Strobe, Address, DTACK.)
Vector     Description
0-1        Reset vector: initial PC and SP
2          Bus error (also used for page faults)
4          Illegal instruction
5          Divide by zero
6-7        Chk and TrapV (See Chapter 2)
8          Privilege violation
9          Trace
10-11      For emulation of unimplemented instructions
13         Coprocessor protocol violation
25-31      Levels of external interrupts
32-47      User traps
48-54      Floating-point coprocessor vectors
56-58      Memory management coprocessor vectors
64-255     User-defined


TABLE 1.2 Some 68020 exception vectors. 


Bus Arbitration 

The 68000 family contains control lines that support the arbitration 
between multiple bus masters with minimal external hardware. These lines 
can also be used with a bus arbitration chip to support complex architectures 
consisting of multiple local and system buses. Provision is also made for 
locking the bus—preventing another processor from obtaining access. This 
step is crucial for providing indivisible operations for semaphores. (See 
Chapter 2.) 


Fault Tolerance 

The 68000 family enables designers to create systems remarkably tolerant 
of errors. Depending on the response from the device, the 68000 will retry a 
bus cycle or signal a bus error exception which can then be used by the 
operating system to recover from the error. This is the mechanism used to 
generate page faults in virtual memory systems. Only a bus error during the 
exception processing for another bus error (double bus fault) will cause the 
processor to come to a complete halt. 


Dynamic Bus Sizing 

The 68000 can support older 8-bit peripherals. The 68020 takes this 
support a step further with a concept called dynamic bus sizing. The 68020 
can directly access 8-bit, 16-bit and 32-bit devices, adjusting to the bus size used
on a cycle-by-cycle basis according to each device's response. It will 
automatically make multiple transfers when necessary. In other words, the 
programmer can write a full 32-bit word to a port and if that port is only 8 
bits wide, the CPU will automatically make four separate 8-bit transfers. 
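The splitting of one write into several narrower transfers can be modeled in Python. The sketch below is a simplification with assumptions of our own: the function name is hypothetical, only aligned transfers are considered, the most significant byte is assumed to go to the lowest address, and none of the actual bus signals are modeled.

```python
def split_transfer(value, operand_bytes, port_bytes):
    """Break an operand into the transfers needed for a given port width.

    Returns a list of (address_offset, chunk_value) pairs, most
    significant chunk first. Widths are given in bytes (1, 2 or 4).
    """
    if operand_bytes not in (1, 2, 4) or port_bytes not in (1, 2, 4):
        raise ValueError("widths must be 1, 2 or 4 bytes")
    chunk = min(operand_bytes, port_bytes)
    # Keep only as many bits as the operand holds, then lay it out
    # with the most significant byte first.
    mask = (1 << (8 * operand_bytes)) - 1
    data = (value & mask).to_bytes(operand_bytes, "big")
    return [
        (offset, int.from_bytes(data[offset:offset + chunk], "big"))
        for offset in range(0, operand_bytes, chunk)
    ]


for width in (4, 2, 1):
    cycles = split_transfer(0x12345678, operand_bytes=4, port_bytes=width)
    print(width, [(off, hex(val)) for off, val in cycles])
```

The program issuing the write is the same in all three cases; only the number of bus cycles changes with the width of the port.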
Summary

The 68000 family of microprocessors is a strong line of intercompatible 
devices. From its inception, the family was designed with the capacity of the 
high-end members in mind. Because the chips in the family were designed 
from the beginning to be compatible, few compromises had to be made 
during the design process, resulting in a single, clean instruction set for all 
the processors, and a clean hardware interface among the processors and 
support chips. This clean structure means that computers using the 68000 
family are inherently easier to design, produce and upgrade. 


2 


The 68000 
Instruction Set 


Daniel Appleman 


This chapter provides an introduction to the instruction 
set of the 68000 microprocessor family. This summary 
is not intended to be a complete reference work; rather, 
it is aimed at the experienced assembly programmer 
who wants to get a general feel for the 68000. 


In general, a 68000 assembly language instruction has three parts. In a
source listing, each instruction takes the form 


op source, destination 


where op represents the mnemonic operation code (opcode) for the instruction, 
and source and destination usually represent register numbers or data addresses 
in memory. Many instructions allow the use of almost any addressing mode 
for describing the source and destination. When an operation specifies a data 
register, any of the eight data registers may be used. The same applies to 
address registers. 

My examples will specify arbitrary data or address registers for the sake of 
illustration; keep in mind that all of the registers of a certain type behave 
identically—except for register a7, which serves as stack pointer for program 
flow instructions. Different assemblers have different mechanisms for specify- 
ing numeric base (hexadecimal, octal, decimal or binary). For the sake of 
these examples, most data values are decimal and all addresses are in hex. The 
prefix '$' indicates a hexadecimal value. 


Data Size 

The Motorola 68000 assembler uses a set of opcode suffixes to specify the 
size of the operation. op.b (where op is any instruction opcode) refers to a 
byte operation, op.w and op (no suffix) refer to word operations, and op.l 
refers to longword (32-bit) operations. Most examples in this chapter will use 
16-bit (word) operations. 


Adding One Data Register to Another 

In this first example, we'll add the contents of data register 0 (d0) to data 
register 1 (d1). Here's how you might do it with byte size data (affecting only 
the low order bytes of the two registers): 


add.b d0,d1 


Because of the suffix ".b" on the add instruction, the assembler program 
will create an instruction for byte data. The three high-order bytes in d1 will 
not be modified at all. The processor's status flags will be set according to the 
result of the byte operation and will not reflect the rest of the register. 


Testing and Branching 
Let's say you do a byte add and want to jump to address label LABEL1 if 
the result is 0. The instructions would be: 


add.b d1,d3 ;add d1 to d3 
beq LABEL1 ;branch if low byte of d3 is zero 


If you then wanted to branch to LABEL2 if the entire register d3 is equal 
to zero, you would need to "test" the value of d3 first: 


tst.l d3 ;Do a long (32-bit) test of d3 
beq LABEL2 ;Branch if d3 is zero 


This is also equivalent to doing a 32-bit comparison with zero: 


cmp.l #$00000000,d3 ;compare d3 with zero 
beq LABEL2 ;branch if equal. 


Why use the tst instruction instead of comparing with immediate zero? 
Because the tst instruction is shorter and runs faster. 


Immediate Data Addressing 

The previous example introduced immediate addressing, in which the 
instruction includes the source data. The classic example is probably adding a 
constant to a register: 


add #500,d1 ;add 500 to d1 


Since this is a word size operation (no size suffix), the high-order word 
will not be affected by the operation and the status flags will depend only on 
the results of the word operation. 


Address Register Indirect Addressing 

The address registers were designed primarily to support complex address 
operations and thus have limited arithmetic capability. What capability they 
do support is related directly to supporting the addressing function. For 
example: 


movea d1,a0 ;move d1 into a0 


Note that operations on address registers affect the entire register (unlike 
operations on data registers). In this case we have a 16-bit move to an address 
register. The 16-bit data is sign extended before being moved to a0. This 
means that the sign bit of the 16-bit value in d1 is duplicated into the high-order 
16 bits of the 32-bit value stored into a0. Longword data (movea.l) 
would have been transferred directly. Byte operations on the address registers 
are illegal. 
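
To make the sign extension concrete, here is a small C model of a word-size movea. The function name movea_w is hypothetical; it simply returns the 32-bit value that would end up in the address register.

```c
#include <stdint.h>

/* Model of a word-size "movea Dn,An": take the low 16 bits of the data
   register, copy bit 15 into the upper 16 bits, and return the full
   32-bit value that replaces the address register. */
uint32_t movea_w(uint32_t dn)
{
    uint32_t word = dn & 0xFFFFu;

    if (word & 0x8000u) {
        return word | 0xFFFF0000u;   /* negative word: upper half all ones */
    }
    return word;                     /* positive word: upper half all zeros */
}
```

For instance, a data register whose low word is $3012 gives $00003012, while a low word of $8000 gives $FFFF8000. The upper word of the data register never reaches the address register.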

To use address register indirect addressing, you first load the address of the 
data you want to access into the address register, as above. Next, you access it 
through the address register. For example: 


move.l (a0),d2 ;move longword into d2 


moves a longword (4 bytes) of data from memory at the address contained 
within a0 into d2. 


Absolute Short Addressing 

You have a serial input port at address $3012 and want to read its contents. 
Since it is an absolute address whose value can be expressed by a 2-byte word 
(positive or negative value within 32K of $0), you can use the Absolute 
Short addressing mode: 


move.b $3012,d4 ;get value at serial port 


Most ports on the 68000 bus are likely to be 8- or 16-bit ports and work 
as described above. Many 68020 ports are likely to be a full 32 bits wide, in 
which case the entire register will be affected. 


Absolute Long Addressing 

The absolute short addressing mode is limited to a 64K address space (16- 
bit address). The address specified is sign extended to 32 bits before being 
used. For instructions using larger addresses you would use absolute long 
addressing. For example, the equivalent of the above operation for a port at 
location $A3012 would be: 


move.b $a3012,d4 ;get data from serial port 


Implementing a Stack 

Let's say you have an application in which you need to use a stack in 
memory. You might use address register a4 as a pointer to the stack, which 
will grow from high memory to low memory. The pointer is defined as 
pointing to the top of stack (the lowest address that holds actual data). For 
this example, we'll restrict the values on the stack to values from 0 to 10000. 
We'll put -1 into the bottom of the stack to aid in detecting errors. This 
particular stack consists of 16-bit words only. Remember that these 
restrictions are only for the sake of demonstration and are not related to any 
hardware limitations. To be specific: any register (a0 through a7) could be 
used as stack pointer. The stack can use bytes, words and long words and need 
not be restricted to one type. Limitations on values of the stack contents are 
related only to the application and to the word size used. 

Let's put the stack at address $3f0020: 


move.l #$3f0020,a4 ;load initial top/bottom of stack 


Now let's initialize the current top of stack with -1: 


move #-1,(a4) ;load -1 into the addr pointed to by a4 


The addressing mode (a4) indicates that a4 contains the memory address to 
be accessed. In the next example I'll demonstrate how to move data to and 
from the stack. 


Address Register Indirect With Predecrement 

Now let's say you want to push the number 25 onto the stack. The 68000 
family supports an addressing mode which decrements the pointer before 
storing the value on the stack. 


move #25,-(a4) ;Push 25 onto the stack 


First, the value in register a4 is decremented by two (because a word 
occupies two bytes), then a word containing the value 25 is stored at the 
location pointed to by a4. 


Address Register Indirect With Postincrement 

Now you want to pop the value off the top of the stack and place it in d3. 
The 68000 makes this easy by supporting an addressing mode in which a 
pointer is incremented after pulling the value from the stack. 


move (a4)+,d3 ;Pop the top of stack into d3 


Since this was a word operation the pointer was incremented by two. Both 
the predecrement and postincrement modes understand the size of the operation 
and will add or subtract 1 for byte operations, 2 for word operations and 4 for 
longword operations. 
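
C programmers will recognize these two modes as the *--p and *p++ idioms. The sketch below models the word stack from this example under that analogy. The type word_stack_t and the function names are hypothetical, and the stack is a small array rather than memory at $3f0020.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical word stack that grows toward lower addresses.
   'sp' plays the role of a4 and always points at the top item. */
typedef struct {
    int16_t  mem[STACK_WORDS];
    int16_t *sp;
} word_stack_t;

/* Like loading a4 with the stack address, then "move #-1,(a4)". */
void stack_init(word_stack_t *s)
{
    s->sp = &s->mem[STACK_WORDS - 1];
    *s->sp = -1;                 /* bottom-of-stack marker */
}

/* Like "move #value,-(a4)": decrement the pointer, then store. */
void stack_push(word_stack_t *s, int16_t value)
{
    *--s->sp = value;
}

/* Like "move (a4)+,d3": fetch the word, then increment the pointer. */
int16_t stack_pop(word_stack_t *s)
{
    return *s->sp++;
}
```

Because the pointer has type int16_t *, the C compiler scales the increment and decrement by two bytes, just as the processor does for a word operation.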


The chk Instruction 

Let's continue using the example of the stack introduced above. A com- 
mon occurrence in such an application would involve checking the stack for 
error conditions. An error condition in this example would be attempting to 
read the bottom of the stack or a value that exceeds the limit—in this case 
10000. Before popping the stack you can use the 68000 chk instruction to 
check the data: 


move (a4),d3 ;move the top of stack into d3 
chk #10000,d3 ;check d3 


When the chk instruction is executed, if the 16-bit word in d3 is smaller 
than 0 or larger than the source operand, a trap exception occurs to the chk 
trap vector. 

At this point you've probably spotted some limitations to the chk 
instruction. What if this is an application program that cannot use exception 
vectors effectively? What if this is a 32-bit stack? Clearly this instruction is 
designed for operating system support. A realistic application might be a 
table of file handles. An allocation routine provides a file number from 0 to 
16. The number returned by the routine can be checked with the chk instruction 
and verified. In such a situation a system trap might be the appropriate 
response to an error. A user application could use a pair of cmp (compare) 
instructions to check the bounds. On a 68020 system the cmp2 instruction 
could be used. For example: 


cmp2 $1000,d3 ;check d3 against bounds at $1000 


Here, the processor expects to find bounding values for the comparison at 
memory location $1000. The specified memory location contains two data 
items—a lower and upper bound. In this example, the lower limit, -1, would 
be at $1000 and the upper limit, 10,000, would be at $1002. The size of the 
items depends on the size of the operation. The size can be byte, word or 
longword. The cmp2 instruction causes the status flags to be set: The Z (zero) 
flag is set if d3 is equal to either bound. The C (carry) flag is set if d3 is out 
of bounds. A similar 68020 instruction, chk2, causes a trap exception for the 
out-of-bounds condition. 
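
The bounds tests described above can be summarized in a few lines of C. This is only a model of the conditions as stated in the text, for signed word-size operands; the names chk_would_trap, cmp2_flags_t and cmp2_word are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a word-size "chk #upper,Dn": returns true when the
   instruction would take the chk trap. */
bool chk_would_trap(int16_t value, int16_t upper)
{
    return value < 0 || value > upper;
}

/* Hypothetical result type holding the two flags that cmp2 sets. */
typedef struct {
    bool z;   /* value equals the lower or the upper bound */
    bool c;   /* value lies outside the range lower..upper */
} cmp2_flags_t;

/* Model of a signed, word-size cmp2 against a lower/upper bound pair. */
cmp2_flags_t cmp2_word(int16_t value, int16_t lower, int16_t upper)
{
    cmp2_flags_t flags;

    flags.z = (value == lower) || (value == upper);
    flags.c = (value < lower) || (value > upper);
    return flags;
}
```

With bounds of -1 and 10000, the bottom-of-stack marker sets Z without setting C, so a program can tell "hit the marker" apart from "out of range" with two conditional branches.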


Address Register Indirect With Displacement 
Let's say you want to access the character at offset 3 in a string, which starts at 
address $a349b, and load it into d5. Here's how you might do it: 


move.l #$a349b,a2 ;Load the base address into a2 
move.b 3(a2),d5 ;Load the char at (base+3) into d5 


This addressing mode is most commonly used when dealing with structures 
or tables. Positions of elements in a structure are usually defined at 
compile time, thus they can be accessed efficiently using a base address and 
offset. The offset is a sign extended 16-bit value. This example can easily be 
extended to demonstrate word and longword arrays: just substitute op or op.l 
for op.b in every command. 


Address Register Indirect With Index 

Fixed displacements are fine for accessing elements with fixed offsets, but 
many string applications require dynamic access to characters in the string. 
The address register indirect with index mode is ideal for this application. A 
good example is searching for the last character in a string. For this example 
let's assume that the last character in the string is defined by the null character 
(value of zero). This convention is used in the C language. The string starts 
at address $75b96. Let's find the end: 


move.l #$75b96,a2 ;put base address in a2 
clr d4 ;set d4 to zero 
A: tst.b 0(a2,d4) ;test char number (d4) 
beq DONE ;If zero then exit 
addq #1,d4 ;add one to d4 
bra A ;Branch to A 
DONE: move.l a2,a4 ;Copy base address to a4 
adda d4,a4 ;Add number of bytes scanned 


Register a4 now points to the last character (the null character). Register 
d4 contains the number of characters in the string (not counting the terminat- 
ing null character). The maximum string length that this routine can handle is 
32767, because the addressing mode uses a 16-bit sign extended displacement 
as the index. 

The addq instruction used in this example differs somewhat from the 
standard add instruction. It is used to add values from 1 to 8 to the destination. 
Within this limitation it is significantly faster than the regular add command, 
because the 3 bits needed to represent the values 1 to 8 are included in the 
instruction's opcode. Thus the addq instruction fits into a single word. The 
regular add command needs a second (or second and third) word as a parameter. 


Searching for a Free Structure 
in an Array of Structures 

In the example above, you may have noticed the 0 in front of the (a2,d4). 
The indirect with index addressing mode allows the programmer to include an 
8-bit displacement to be added to the address. This can be used when accessing 
an array of structures. Let's modify the example to search through an array of 
structures, each 100 bytes long. In each structure, byte 53 will be zero if the 
structure is available for allocation. We'll use the same base address as before: 


move.l #$75b96,a2 ;put base address in a2 
clr d4 ;set d4 to zero 
A: tst.b 53(a2,d4) ;test byte 53 in current structure 
beq DONE ;If zero then exit 
add #100,d4 ;next structure 
bra A ;Branch to A 
DONE: adda d4,a2 ;Add d4 to a2 


Register a2 now points to the base of the first available structure. As you 
can see, this addressing mode provides for very efficient access of fairly com- 
plex data structures. 
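
For readers who think in C, the search loop corresponds roughly to the sketch below. The function name find_free and the count parameter are assumptions added for this illustration; the assembly version has no upper limit and keeps scanning until it finds a zero byte.

```c
#include <stddef.h>
#include <stdint.h>

#define STRUCT_SIZE 100   /* each structure is 100 bytes long */
#define FLAG_OFFSET 53    /* byte 53 is zero when the structure is free */

/* Returns a pointer to the first free structure, or NULL if none of
   the 'count' structures is free. 'offset' plays the role of d4. */
uint8_t *find_free(uint8_t *base, size_t count)
{
    size_t limit = count * STRUCT_SIZE;

    for (size_t offset = 0; offset < limit; offset += STRUCT_SIZE) {
        if (base[FLAG_OFFSET + offset] == 0) {   /* tst.b 53(a2,d4) */
            return base + offset;                /* adda d4,a2 */
        }
    }
    return NULL;
}
```

The constant displacement selects a field inside the structure while the index register selects which structure, which is exactly how base[FLAG_OFFSET + offset] splits the address.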


Program Counter Relative 


Many of the examples I've used so far include branch instructions. Most 
68000 branch instructions use an 8- or 16-bit displacement from the current 
program counter. This corresponds to the standard relative branch available in 
almost every microprocessor. Most 68000 assemblers automatically use the 
short displacement whenever possible. 


Program Counter Relative With Index 


This mode can be used to efficiently implement jump tables. Say, for 
example, that we have a table of 32-bit subroutine addresses and we want to 
jump to the routine whose address is stored in the table at the entry number 
contained in register d2. This type of application is very common in operat- 
ing systems and is used to allow a user to access system routines via a trap 
mechanism. The code might look something like this: 


;d2 contains an integer: 0 - maximum of the number of the routine 
;to access. TABLE is a table of 32 bit addresses. 
lsl #2,d2 ;Multiply d2 by 4 by shifting left 
;(each address is four bytes long) 
move.l TABLE(pc,d2.l),a0 ;fetch the address stored d2 bytes 
;from the start of TABLE (a0 is scratch) 
jmp (a0) ;jump to that routine 
TABLE: dc.l (address of routine 1) 
dc.l (address of routine 2) 
dc.l (address of routine 3) 
;... etc. 


Some new stuff here: The lsl instruction does a logical left bit-shift of d2, 
shifting bits out of the high-order end and shifting zeroes into the low order 
end. The immediate argument (#2) says to shift by 2 bits. The net effect is to 
multiply d2 by 4. Since the displacement in the PC-relative indexed mode 
is limited to an 8-bit value, the array TABLE must be within 127 
bytes of the instruction that refers to it. This limitation can be easily dealt 
with by the assembly language programmer. Not every compiler will be able to 
take advantage of it, however. 


68020 Addressing Modes 


As you can see, the 68000 has a powerful and comprehensive set of 
addressing modes. When Motorola went on to develop their 32-bit micro- 
processor, the 68020, they began with these addressing modes thus ensuring 
compatibility with the 68000. But they also took them a step further. The 
next few examples deal with addressing modes available only on the 68020. 


Register Indirect 
With Index (8-Bit Displacement) 

You create a lookup table containing 91 entries. Each entry contains the 
value for sin(x) where x is a value in degrees from 0 to 90. Each entry is a 
64-bit floating-point number (8 bytes). Given the parameter in degrees in 
register d0, load d1 with the high-order 32 bits and d2 with the low-order 32 
bits of sin(d0). Register a0 contains the base address of the table. To do this, 
we can use Address Register Indirect With Index (8-Bit Displacement). This is 
a variation of the original 68000 indirect with index mode in which the index 
register can be multiplied by a scale factor of 1, 2, 4 or 8 before adding it into 
the effective address. 


;sin(d0) table look up. a0 contains base address of the table 
move.l 0(a0,d0*8),d1 ;move contents of 0+a0+d0*8 into d1 
;note that each entry is 8 bytes long 
move.l 4(a0,d0*8),d2 ;move next long word into d2 
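
The address arithmetic behind this mode is simple enough to write out. The C function below is a sketch of the calculation only; the name ea_scaled_index is hypothetical, and register contents are passed in as plain integers.

```c
#include <stdint.h>

/* Effective address for d8(An,Xn*scale): the base register, plus the
   sign-extended 8-bit displacement, plus the index register multiplied
   by a scale factor of 1, 2, 4 or 8. Arithmetic wraps at 32 bits. */
uint32_t ea_scaled_index(uint32_t an, int32_t xn, uint32_t scale, int8_t d8)
{
    return an + (uint32_t)(int32_t)d8 + (uint32_t)xn * scale;
}
```

With a table base of $1000 and d0 = 30, the two moves above would read from $10F0 and $10F4, the two halves of entry number 30.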


A very similar mode, Program Counter Indirect With Index (8-Bit Dis- 
placement), can be used to put the table in program memory. It can also 
simplify the jump table demonstration from the previous example: 


;d2 contains an integer 0 - maximum of the number of the routine 
;to access. TABLE is a table of 32 bit addresses. 
move.l TABLE(pc,d2.l*4),a0 ;fetch the address stored 
;(d2*4) bytes from TABLE (a0 is scratch) 
jmp (a0) ;jump to that routine 


The shift is no longer needed, because it's done automatically within the 
instruction. 


Register Indirect With Index (Base Displacement) 

A more significant improvement was made in creating an addressing mode 
which is a superset of the original address register indirect with index mode by 
eliminating the 8-bit displacement limitation. The new address register 
indirect with index (base displacement) and its program space equivalent, 
program counter indirect with index (base displacement), permit the 
displacement to be up to 32 bits long. Moreover, each of the parameters is 
optional. So the following addressing examples are all legal on the 68020: 


Full version: displacement(address register, index register * scale) 
Example: $3589a(a5,d2*4) 


Without displacement: 0(address register, index register * scale) 
Example: 0(a2,a3*8) 


Without base address: displacement(index register * scale) 
Example: $29a3(d4*2) 


Any combination of these would also be valid. Interesting how they 
sneaked the possibility of indirection on data registers into this mode, isn't it? 
These two addressing modes remove certain limitations in the 68000 version, 
but are not radically new. Both of these addressing modes can be used with 
PC relative addressing to perform similar operations in program address space. 


Memory Indirect Pre-Indexed 

In this example we'll define a structure called SEGID, which contains 
information about segments of memory. A table of 32-bit address pointers, 
SEGTABLE, contains the addresses of the currently allocated SEGID struc- 
tures. SEGID could be defined as follows: 


Offset Data 
0 Length (16 bits) 
2 Status (16 bits) 
4 Start address (32 bits) 
8 End address (32 bits) 


A block like this one appears at the beginning of each memory segment. 
SEGTABLE points to these blocks. In this example we want to find the 
segment whose number appears in register d5 and move the start address to a2 
and the end address to a3. This can be done very efficiently using memory 
indirect pre-indexed addressing: 


move.l #SEGTABLE,a0 ;get base address of SEGTABLE 
move.l ([0,a0,d5*4],4),a2 ;load start address 
move.l ([0,a0,d5*4],8),a3 ;load end address 


Here's what this does: Add the base address in a0 to the base displacement (0 
in this case), then add the scaled index (d5*4). Use the resulting address to 
read a pointer from memory. Add the outer displacement (4 or 8) to this pointer 
and use the new address to read the actual data to move. What we have here is 
two levels of indirection. A pointer in memory is used as one of the components 
to form the final effective address of the operand. 


Memory Indirect Post-Indexed 

In this example we implement the segment list using a linked list instead 
of a table of pointers. The SEGID structure will be defined as follows: 


Offset Data 
0 Length (16 bits) 
2 Status (16 bits) 
4 Start address (32 bits) 
8 End address (32 bits) 
12 Pointer to next SEGID (32 bits) 
16 Pointer to previous SEGID (32 bits) 


In this case, register a0 points to an arbitrary SEGID and we wish to move 
the start and end address of the next SEGID into a2 and a3. This can be done 
with memory indirect post-indexed addressing as follows: 


move.l ([12,a0],4),a2 ;load a2 
move.l ([12,a0],8),a3 ;load a3 


We add register a0 to the offset of 12 (pointer to next SEGID) then use the 
address found there as a pointer to the next SEGID. By adding 4 or 8 we select 
the start or end address. This actually does not demonstrate the full power of 
the addressing mode. Let's modify the example. Instead of specifying the start 
or end address we want to choose one of the four addresses depending on the 
value in register d2. If d2 is 0 let's get the start address; if d2 is 3 let's get the 
pointer to the previous SEGID, and so on: 


move.l ([12,a0],d2*4,4),a2 


We get the pointer to the next SEGID as before, we add the constant 4 to 
get past the length and status words, then add the index in d2, scaled by 4, 
to select one of the four addresses. 

Remember that both memory indirect addressing modes are also available 
in program counter relative versions which access program memory. 
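
The only difference between the two memory indirect modes is whether the index is added before or after the pointer is fetched. The C sketch below models both calculations over a simulated big-endian memory image. All of the names (read_long, ea_pre_indexed, ea_post_indexed) are hypothetical, and register contents are passed in as plain integers.

```c
#include <stdint.h>

/* Read a big-endian longword from a simulated memory image. */
uint32_t read_long(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* ([bd,An,Xn*scale],od): the scaled index is added BEFORE the pointer
   is fetched, so it selects which pointer in a table to follow. */
uint32_t ea_pre_indexed(const uint8_t *mem, uint32_t bd, uint32_t an,
                        uint32_t xn, uint32_t scale, uint32_t od)
{
    uint32_t pointer = read_long(mem, bd + an + xn * scale);
    return pointer + od;
}

/* ([bd,An],Xn*scale,od): the scaled index is added AFTER the pointer
   is fetched, so it selects an element inside the pointed-to object. */
uint32_t ea_post_indexed(const uint8_t *mem, uint32_t bd, uint32_t an,
                         uint32_t xn, uint32_t scale, uint32_t od)
{
    uint32_t pointer = read_long(mem, bd + an);
    return pointer + xn * scale + od;
}
```

Both functions return the final effective address; the processor would then read or write the operand at that address.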


Moving a String in Memory 

Moving data quickly is one of the most important tasks any computer 
faces. Some microcomputers are said to be able to efficiently move entire 
strings with one instruction. What is not mentioned is that the single instruc- 
tion requires that the initial move information be set up in specific registers 
that are dedicated to string moves. By using limited "string move" registers 
the advantages of orthogonality are lost. The 68000 family keeps the orthog- 
onality, allowing you to use your choice of registers to set up for the move. 

In this example we have two strings, STRING1 and STRING2. We wish 
to copy STRING1 into STRING2. We also know the number of bytes to 
move, stored in NUMBEROFBYTES. Assume that STRING1, STRING2 
and NUMBEROFBYTES are locations in memory that contain the 
information. 


;copy NUMBEROFBYTES bytes from STRING1 to STRING2 
;In this routine NUMBEROFBYTES >= 1 


move.l STRING1,a1 ;any address reg would do here 
move.l STRING2,a2 
move NUMBEROFBYTES,d0 
subq #1,d0 ;Set the counter to NUMBEROFBYTES-1 
dothecopy: 
move.b (a1)+,(a2)+ ;move a byte - increment a1 and a2 after 
dbf d0,dothecopy ;Decrement d0, loop if d0>=0 


The DBF instruction is one of a group of decrement and conditional branch 
instructions that are useful for this sort of loop. 

Let's modify the example. This time we know that the string has a 
termination byte of 0. The maximum string length is 5000. In this case we 
want to copy STRING1 to STRING2 stopping at the end of the string (byte 
with a value of 0 found) or at 5000 bytes, whichever comes first: 


;copy STRING1 to STRING2, up to 5000 bytes 


move.l STRING1,a1 ;any address reg would do here 
move.l STRING2,a2 
move #4999,d0 ;set up maximum count-1 
dothecopy: 
move.b (a1)+,(a2)+ ;move a byte - increment a1 and a2 after 
dbeq d0,dothecopy ;Decrement d0 - If char not zero then 
;If d0>=0 goto dothecopy 


The dbeq instruction is a form of the decrement and conditional branch. It 
works as follows: If the condition tested is true, the instruction does nothing 
(falls through to the next instruction). If the condition is false, the specified 
register is decremented. If the result is not -1, a branch is taken to the spe- 
cified location. This two-command set is slower than some of the com- 
petition's single instruction moves. But Motorola added improvements on the 
68010 and 68020 to eliminate even this small disadvantage. The 68010 has a 
two-instruction cache that detects a tight loop like the one above and keeps 
the instructions inside the CPU so that they do not need to be fetched each 
time. In the 68020 the problem does not exist at all due to the 256 byte on- 
chip instruction cache. 
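
The two exit conditions of the dbeq loop are easy to get backwards, so here they are spelled out in C. This is a model of the copy loop above, not a replacement for it; the function name copy_until_zero is hypothetical, and max_count is assumed to be at least 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the move.b / dbeq loop: copy bytes, including a terminating
   zero if one is found, but never more than max_count bytes.
   Returns the number of bytes copied. */
size_t copy_until_zero(uint8_t *dst, const uint8_t *src, uint16_t max_count)
{
    uint16_t counter = (uint16_t)(max_count - 1);   /* move #max-1,d0 */
    size_t   copied  = 0;

    for (;;) {
        uint8_t byte = *src++;      /* move.b (a1)+,(a2)+ */
        *dst++ = byte;
        copied++;
        if (byte == 0) {            /* condition true: dbeq falls through */
            break;
        }
        counter--;                  /* condition false: decrement d0 */
        if (counter == 0xFFFF) {    /* d0 has reached -1: fall through */
            break;
        }
    }
    return copied;
}
```

Note that the counter is only decremented when the condition is false, and that the loop ends when the 16-bit counter wraps to -1 rather than when it reaches zero. That is why the count register is loaded with the limit minus one.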

The previous example can be modified to search for a particular byte: 


;search STRING1 for a linefeed ($0a), up to 5000 bytes 

move.l STRING1,a1 ;point to string 
move #4999,d0 ;set up maximum count-1 
dothesearch: 
cmp.b #$0a,(a1)+ ;Compare a byte - increment a1 after 
dbeq d0,dothesearch ;Dec d0 - If the byte not $0a then 
;if d0>=0 goto dothesearch 


Complex Addressing With the 68000 

The fact that the 68000 chip is missing some of the 68020's complex 
addressing modes does not mean that those same functions are not still needed. 
Take, for example, the 68020 operation below, which uses memory 
indirection: 


move.l ([$1d,a0,d5*4],4),a2 


We can implement the same instruction using a temporary register—say 
a5: 


lsl #2,d5 ;do the scaling (multiply d5 by 4) 
move.l $1d(a0,d5),a5 ;get the address 
move.l 4(a5),a2 ;get the data 


Stack Frames 

The 68000 was designed to support high-level languages efficiently. One 
of the places in which this is best demonstrated is in the handling of stack 
frames. There must be a mechanism for passing parameters to procedures in 
high-level languages. In addition to this, a procedure will often have local 
variables that are private to that procedure only. Both requirements can be 
handled effectively using a mechanism called a stack frame. The main stack 
pointer on the 68000 is one of the a7 registers. Normally the calling routine 
will push parameters onto the stack then execute a jsr (jump subroutine). For 
this example we'll use two parameters. A 32-bit integer in d5 equal to 
$17601, and a 32-bit address represented by effective address 8(a5,d4): 


move.l d5,-(a7) ;Push integer param d5 onto stack 
pea 8(a5,d4) ;Push the effective address 8(a5,d4) 
jsr NEWPROCEDURE ;call the subroutine 


The pea (push effective address) instruction takes the calculated effective 
address of the pointer and pushes it onto the stack. The lea (load effective 
address) instruction is similar, but puts the effective address into an address 
register. For the purposes of this example, let's assume that the effective 
address of 8(a5,d4) evaluates to $450000. If the stack started at $a3020, it 
might now look as follows: 


$a3020 (undefined) top of stack 
$a301c $00017601 first parameter 
$a3018 $00450000 second parameter 
$a3014 (return address) 


The stack pointer (a7) would now contain $a3014. So far so good. Now 
we need a mechanism to provide for local variables. In this example we'll use 
a fairly simple model—that of the C language. In C, local variables are not 
available to other procedures. We'll use a6 as a stack frame pointer (this is a 
very common use for a6). For this example let's define three local variables. 
VAR1 and VAR2 are long integers (4 bytes each). VAR3 will be a pointer. 
The called procedure will set VAR1 to $25, VAR2 to the value of the integer 
passed as a parameter to the routine ($17601) and VAR3 to the value of the 
pointer passed to the routine ($450000). Equivalent C code for this example: 


/* in calling routine */ 

long integerind5; char *pointerasdescribed; 
NEWPROCEDURE(integerind5, pointerasdescribed); 

/* the called routine */ 

long NEWPROCEDURE(PARAM1, PARAM2) 
long PARAM1; /* The first parameter is a 32-bit integer */ 
char *PARAM2; /* The second parameter is a pointer */ 
{ 
long VAR1, VAR2; /* Define VAR1 - VAR3 */ 
char *VAR3; 
VAR1 = 0x25; 
VAR2 = PARAM1; 
VAR3 = PARAM2; 
INPROCLABEL: /* label to break the example into two parts */ 
return (VAR2); 
} 


Here's the assembly code to implement NEWPROCEDURE up to 
INPROCLABEL: 


link a6,#-12 ;create local variable space 
move.l #$25,-4(a6) ;VAR1 = $25 
move.l 12(a6),-8(a6) ;VAR2 = PARAM1 
move.l 8(a6),-12(a6) ;VAR3 = PARAM2 

The key instruction here is the link instruction. It takes a frame register 
(here it's a6) and a (usually) negative displacement as parameters and has three 
parts. First it pushes the current frame register address onto the stack (old a6 
in this case). Next it moves the current stack pointer into the frame register. 
Last, it adds the displacement to the stack pointer. In most cases you'll want 
to use a negative value here because the stack grows downward in memory. 
The displacement is equal to the total space allocated to local variables. This 
results in the stack pointer correctly set to the end of the stack after local 
variables are allocated. The parameters passed to the routine can be accessed as 
positive offsets to the frame register and local variables can be accessed as 
negative offsets. Using this mechanism it is clear that other procedures can be 
called without risking loss of data. In addition, recursion is possible. 


Getting out of Stack Frames 
In this example we'll assume that results are returned in register d0. 
Getting out of the stack frame works something like this: 


move.l -8(a6),d0 ;return(VAR2) 
unlk a6 ;restore original stack frame 


The unlk instruction has now effectively freed everything below the return 
address. Unlk accepts one parameter—the frame register. It inverts the link 
command: The current frame register is loaded into the stack pointer, then the 
value of the old frame register is popped off the stack into the frame register. 
All we have to do now before we can return to the calling procedure is to get 
rid of the parameters. There are two ways to do this. One is the rtd command 
(available on the 68010 and the 68020 only): 


rtd #8 ;Pop stack into program counter (return) but 
;add displacement to the stack pointer before 
;continuing. 


This is an efficient solution to moving the stack pointer past the 
parameters but it does have a major disadvantage: In order for the called 
routine to pop the parameters it must always know how many there were. 
Thus the number of parameters is fixed and all parameters must always be 
used. Since the C language allows variable numbers of parameters, it is clear 
that a different solution is needed. With C stack frames, routines commonly 
end with a simple rts (return from subroutine). In that situation, the 
instruction after the jsr (jump to subroutine) would be something like: 


addq #8,a7 ;pop off two long word parameters 


The calling routine always knows (hopefully) how many parameters were 
originally placed on the stack and thus knows how many parameters to pop 
off the stack when the called procedure returns. 
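
To see how link and unlk mirror each other, it can help to step through them on a toy machine. The C sketch below is such a model, with a small longword memory and only the a6 and a7 registers. The type machine_t and all function names are hypothetical, and the addresses are far smaller than the $a3020 used in the example.

```c
#include <stdint.h>

#define MEM_LONGS 64

/* Hypothetical miniature machine: a small memory of longwords plus the
   two registers that link and unlk touch. Addresses are byte addresses. */
typedef struct {
    uint32_t mem[MEM_LONGS];
    uint32_t a6;   /* frame pointer */
    uint32_t a7;   /* stack pointer */
} machine_t;

/* Like "move.l value,-(a7)". */
static void push_long(machine_t *m, uint32_t value)
{
    m->a7 -= 4;
    m->mem[m->a7 / 4] = value;
}

/* Like "move.l (a7)+,Rn". */
static uint32_t pop_long(machine_t *m)
{
    uint32_t value = m->mem[m->a7 / 4];
    m->a7 += 4;
    return value;
}

/* Model of "link a6,#displacement". */
void link_a6(machine_t *m, int32_t displacement)
{
    push_long(m, m->a6);               /* save the old frame pointer */
    m->a6 = m->a7;                     /* frame pointer = stack pointer */
    m->a7 += (uint32_t)displacement;   /* reserve space for locals */
}

/* Model of "unlk a6". */
void unlk_a6(machine_t *m)
{
    m->a7 = m->a6;                     /* discard the locals */
    m->a6 = pop_long(m);               /* restore the old frame pointer */
}
```

After link_a6, the saved frame pointer sits at 0(a6), the return address at 4(a6), and the parameters at 8(a6) and 12(a6), which matches the offsets used in NEWPROCEDURE. After unlk_a6, the stack pointer is back at the return address, ready for rts or rtd.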


Synchronization Between Processors 

Many 68000 systems involve multiple processors. Not only can multiple 
68000 chips be present on a single bus, but many applications use slave 
processors to handle I/O or specialized applications. Disk drives, terminals, 
graphics devices and other ports are all common devices in 68000 systems. In 
many of these situations it's necessary to provide for common data areas in 
memory for the processors to communicate with each other. 

In the case of serial I/O, for example, the application might maintain an 
output buffer for data to be sent. Another task might handle the actual 
hardware control and empty this same buffer as the data is sent. This other 
task may be a second task under a multitasking operating system, an interrupt 
routine or perhaps an intelligent peripheral controller that shares the system 
bus. We'll use the example of an intelligent peripheral because it best 
illustrates some of the problems and solutions arising from this situation. 

Assume that the main processor places characters into shared memory, 
while the slave processor reads them and sends them out over the serial port. 
The problem relates to synchronizing the two processes. Normally they run 
completely asynchronous to each other, meaning that the second task can take 
over the bus (to read or write data in the shared memory) at any time. Imagine 
that the slave processor checks for an available character at the same time as 
the master processor stores data into shared memory. If the write operation 
involves any sort of pointer update by the master CPU, there's a fair chance 
that the slave CPU will detect a character based on the pointer before the 
master has actually placed the character into the buffer. This could lead to a 
garbage character being sent instead of a valid character. Even if the odds of
this occurring are only one in a thousand, this would represent a 0.1% error
rate, which is not good at all. In situations where the two processes are
manipulating system information, this error rate in a system executing
millions of instructions each second could lead to constant system crashes.

The solution for this (and many similar problems) is to use a mechanism
to "lock" the buffer so that only one of the processes can access it at a given
time. In this case we'll define a byte within the buffer area called
FLAG_BYTE. If the byte is not zero the buffer is in use and cannot be
modified. If the byte is zero, the buffer is available. Each process calls a
function to obtain access to the buffer before reading or writing data:


;function that waits for buffer to become available, then locks
;it so that no other function can use it.
obtain_buffer:
        tas     FLAG_BYTE       ;test and set FLAG_BYTE
        bne     obtain_buffer   ;if byte was not zero - loop back
        rts                     ;buffer is now locked by the caller


The tas (test and set) instruction tests the byte and updates the status 
register. At the same time it sets bit 7 in the tested byte. The important thing 
about tas is that it is "indivisible." It uses a special bus cycle called a read- 
modify-write cycle. Interrupts are deferred until after the cycle is complete. A 
bus lock occurs that prevents any other device on the system from accessing 
the bus during this cycle. The 68020 has two additional instructions, cas and
cas2 (compare and swap), that are more complex versions of tas. The cas
instruction can compare and conditionally update a byte, word or long word
variable; cas2 does the same for a pair of word or long word variables. This
can be useful when doing arbitration between more than two processes.
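
For readers more comfortable in a high-level language, the same locking idea can be sketched in portable C11. This is only an illustration under stated assumptions, not 68000 code: atomic_flag_test_and_set stands in for tas, atomic_compare_exchange_strong stands in for cas, and the names buffer_lock, buffer_owner, try_obtain_buffer, obtain_buffer, release_buffer and claim_buffer are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock flag guarding the shared buffer (clear = available). */
atomic_flag buffer_lock = ATOMIC_FLAG_INIT;

/* One attempt: atomically read the old state and set the flag, as tas does.
 * Returns true if the caller now owns the buffer. */
bool try_obtain_buffer(void)
{
    return !atomic_flag_test_and_set(&buffer_lock);
}

/* Spin until the buffer is free, then lock it (the tas/bne loop). */
void obtain_buffer(void)
{
    while (!try_obtain_buffer()) {
        /* busy-wait */
    }
}

/* Release the buffer so another process can claim it. */
void release_buffer(void)
{
    atomic_flag_clear(&buffer_lock);
}

/* Compare-and-swap variant for more than two contenders:
 * 0 means "free", any other value is the owner's ID. */
atomic_int buffer_owner;

bool claim_buffer(int my_id)
{
    assert(my_id != 0);
    int expected = 0;   /* succeed only if nobody owns the buffer */
    return atomic_compare_exchange_strong(&buffer_owner, &expected, my_id);
}
```

A process would call obtain_buffer() before touching the shared data and release_buffer() afterward. The compare-and-swap form records which contender holds the buffer, which a single flag bit cannot do.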


Bit Manipulations 

Many applications do not limit themselves to manipulating bytes, words 
and long words. Individual bits and groups of bits (bit fields) are often used to 
represent values with limited ranges. The 68000 supports a full complement 
of bit manipulation operations. Operations include bchg (test bit and change),
bclr (clear bit), btst (test bit) and bset (set bit). The 68020 supports a more
sophisticated set of instructions that affect more than one bit. A bit field is an 
array of 1 to 32 bits located anywhere in memory. The bit field can cross byte 
or word boundaries. It is defined by a base effective address, an offset of 0-31 
specifying the start bit of the bit field and the length (1-32) of the bit field. 
The offset and length can be included as part of the instruction or contained in 
other data registers. Operations on bit fields include complement, clear, 
extract, insert, find first bit set, and test. This is probably the most
sophisticated set of bit manipulation commands in any 32-bit processor.
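
The extract and insert operations can be modeled in C to show how a field may straddle byte boundaries. This is a rough sketch, not the hardware algorithm: it assumes the 68020 convention that bit offset 0 is the most significant bit of the base byte, it accepts only non-negative offsets, and the function names bitfield_extract and bitfield_insert are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the unsigned value of a field `width` bits long (1-32) that starts
 * `offset` bits after the most significant bit of base[0]. */
uint32_t bitfield_extract(const uint8_t *base, uint32_t offset, unsigned width)
{
    assert(width >= 1 && width <= 32);
    uint32_t result = 0;
    for (unsigned i = 0; i < width; i++) {
        uint32_t bit = offset + i;
        unsigned value = (base[bit / 8] >> (7 - bit % 8)) & 1u;
        result = (result << 1) | value;   /* shift in bits, MSB first */
    }
    return result;
}

/* Store the low `width` bits of `value` into the same kind of field. */
void bitfield_insert(uint8_t *base, uint32_t offset, unsigned width,
                     uint32_t value)
{
    assert(width >= 1 && width <= 32);
    for (unsigned i = 0; i < width; i++) {
        uint32_t bit = offset + i;
        uint8_t mask = (uint8_t)(0x80u >> (bit % 8));
        /* Field bit i (from the field's MSB) is value bit width-1-i. */
        if ((value >> (width - 1 - i)) & 1u)
            base[bit / 8] |= mask;
        else
            base[bit / 8] &= (uint8_t)~mask;
    }
}
```

With the bytes 0x12 0x34 in memory, extracting 8 bits at offset 4 yields 0x23: the low nibble of the first byte followed by the high nibble of the second.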


For Further Reading 

This chapter does not cover all the details of the 68000 instruction set. For 
full descriptions see the following publications: 

M68000 16/32-Bit Microprocessor: Programmer's Reference Manual, 4th 
ed. Englewood Cliffs, NJ: Prentice-Hall, 1984. 

Osborne 16-Bit Microprocessor Handbook. Berkeley, CA: Osborne/ 
McGraw-Hill. 1981. 

Robinson, Phillip R. Mastering the 68000 Microprocessor. Blue Ridge 
Summit, PA: TAB Books, Inc. 1985. 

Scanlon, Leo J. The 68000: Principles and Programming. Indianapolis, 
IN: Howard W. Sams & Co., Inc. 1981.


The 68000 Family


Daniel Appleman 


The 68000 CPU and its bigger sisters are not the only 
chips in the family. Daniel Appleman uses a hypothetical 
example system to describe the uses of the additional chips 
in the 68000 family, including the special coprocessor 
support available with the 68020 processor. 


By itself, a 68000 processor is a fairly useless entity. Combined with
peripherals and support chips, however, these chips form some of the
most powerful microcomputer systems in the industry.

There are two philosophical extremes to building a microcomputer system. 
One approach is the "minimal system": the simplest (and least expensive) 
support chips are used, and the CPU does as much of the work as possible. 
Apple's Macintosh leans in this direction, although Apple did design some 
custom chips to speed up certain operations. The other extreme offloads much 
of the computation onto various specialized processors, leaving the main 
CPU as a master controller. This approach, which has become much more 
common as hardware costs have declined, will serve as our model as we tour 
the various 68000 support devices by designing an example system. 


Coprocessors vs. Peripherals 

Designing our system will consist largely of defining the operations we 
want to support and then selecting the integrated circuits that best support 
that functionality. Most of these chips will be peripheral chips—chips 
present on the CPU bus and largely controlled by the main processor. From 
the programmer's viewpoint, most peripheral chips appear as one or more 
addresses in memory that can be written to and read from. But things change a 
bit if we're using the high end of the family. With the introduction of the 
68020, the 68000 family could support a somewhat different type of 
peripheral chip: the coprocessor. 

A coprocessor differs from a regular peripheral chip in that it directly 
enhances the operation of the main processor, from the programmer's point of 
view. The operations and data types supported by the coprocessor become 
extensions to the CPU's original operations and data types, and the
programmer can work directly with additional machine instructions. That is
possible because in addition to the coprocessor instructions, the 68020 
supports a special coprocessor interface mode directly in hardware. It is also 
possible to use the coprocessor chips with the 68010, but it requires more 
work on the programmer's part, since the 68010 supports only the basic 
hardware interface and not the coprocessor instructions. The 68000 chip 
cannot support coprocessors at all without additional hardware support. In this 
case the 68000 uses the coprocessor as a standard peripheral chip. 

The 68020 also supports an emulation mode in which the programmer can 
use the new coprocessor instructions even though the hardware is not yet 
present in the system. The instruction is trapped and redirected to the 
appropriate emulation code, permitting the same code to run on a variety of 
systems, from the basic single processor to one with up to eight coprocessors.

When working with a coprocessor, the programmer sees a set of new 
machine language instructions related to that device. The assembler (which
must be designed to support the new instructions) translates them into the
following 68020 instructions:


cpBcc        Coprocessor branch on conditional
cpDBcc       Decrement and branch based on the coprocessor conditional
cpGEN        General coprocessor function
cpRESTORE    Restore the internal state of the coprocessor
cpSAVE       Save the internal state of the coprocessor
cpScc        Set a flag depending on the coprocessor condition
cpTRAPcc     Trap if specified coprocessor condition is true


Each instruction includes the coprocessor ID. The meaning of the 
instruction depends not only on the function specified, but also on the type of 
coprocessor being accessed. cpSAVE and cpRESTORE can be used to save and
restore the current context of a coprocessor, a useful function for exception
handling and multiuser systems.

Because of the transparent nature of the coprocessor instruction, most non- 
system programmers will never use these instructions directly. Instead, the 
assembler will generate them from the mnemonic for that coprocessor. 
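
To make the role of the coprocessor ID concrete, here is a small C sketch that picks apart the first word of a coprocessor instruction. It rests on my reading of the 68020 F-line encoding, which should be checked against Motorola's manual: bits 15-12 are 1111, bits 11-9 hold the coprocessor ID, and bits 8-6 select the instruction class. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cp_class {
    CP_GEN,              /* cpGEN */
    CP_SCC_DBCC_TRAPCC,  /* cpScc, cpDBcc and cpTRAPcc share one class */
    CP_BCC,              /* cpBcc, word or long displacement */
    CP_SAVE,             /* cpSAVE */
    CP_RESTORE,          /* cpRESTORE */
    CP_RESERVED
};

/* Coprocessor instructions occupy the F-line opcode space (assumed). */
bool is_coprocessor_opcode(uint16_t op)
{
    return (op & 0xF000u) == 0xF000u;
}

/* Bits 11-9: which coprocessor the instruction is addressed to. */
unsigned coprocessor_id(uint16_t op)
{
    assert(is_coprocessor_opcode(op));
    return (op >> 9) & 7u;
}

/* Bits 8-6: which of the generic instruction classes this is. */
enum cp_class coprocessor_class(uint16_t op)
{
    assert(is_coprocessor_opcode(op));
    switch ((op >> 6) & 7u) {
    case 0:  return CP_GEN;
    case 1:  return CP_SCC_DBCC_TRAPCC;
    case 2:
    case 3:  return CP_BCC;
    case 4:  return CP_SAVE;
    case 5:  return CP_RESTORE;
    default: return CP_RESERVED;
    }
}
```

Under this assumed layout, an opcode word of 0xF200 decodes as a cpGEN instruction addressed to coprocessor 1.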


Floating-Point Support 

One of the first enhancements to consider on a system intended for 
scientific or computational applications is floating-point support. Before
large-scale integration, most floating-point operations were provided by
subroutine libraries written in machine or high-level languages. Recently it
has become more common to implement hardware floating-point processors. These
processors achieve a significant improvement in speed over software solutions.
Not only do they have the performance inherent in a custom hardware design,
but their architectures support very wide internal registers (commonly 80
bits), which means they can deal with the entire floating-point number at
once with speed and precision.

The MC68881 is Motorola's floating-point coprocessor for the 68000
family. Like other coprocessors, it provides a transparent extension to the 
capability of the main processor. In other words, no special programming is 
required once the assembler supports the new instructions and data types. In 
addition, the 68881 supports the same addressing modes as the 68020 (not 
surprising, since the 68020 is responsible for calculating effective addresses). 

The 68000 supports the following basic data types: 


Byte Integer (xxx.B) 
Word Integer (xxx.W) 
Long Integer (xxx.L) 


Once added to the system, the 68881 provides the following floating-point 
(real) data types: 


Single-Precision Real (xxx.S) 
Double-Precision Real (xxx.D) 
Extended-Precision Real (xxx.X) 
Packed-Decimal Real (xxx.P) 


The 68881 also extends the physical resources available to the programmer
by providing eight additional floating-point data registers (FP0 to FP7). Each
register is 80 bits wide. There are also several control and status registers. The
single-precision format is 32 bits wide and the double-precision is 64 bits 
wide; both are compatible with the IEEE floating-point standard. Following 
this standard guarantees that data formats are compatible with (and thus can be 
transferred to and from) other systems that conform to it. The extended- 
precision format uses the full 80 bits and is designed for use with temporary 
variables or intermediate results. The packed-decimal format uses three long 
words (96 bits) and consists of a 17-digit mantissa and a 3-digit base 10 
exponent. The processor stores and processes the number in extended format, 
converting to and from packed decimal when necessary. This procedure is 
extremely convenient from the programmer's point of view, since the packed- 
decimal format is the easiest to use in a user interface (few people are 
accustomed to typing, reading or writing numbers in the IEEE floating-point 
format). 
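
As a small illustration of why the binary formats are awkward for people to read, the following C sketch splits a single-precision value into the three fields defined by the IEEE standard. It assumes the host compiler's float is the IEEE 32-bit format, which is true on most current machines but is not guaranteed by C; the names single_fields and decode_single are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct single_fields {
    unsigned sign;       /* 1 bit: 0 = positive, 1 = negative */
    unsigned exponent;   /* 8 bits, biased by 127 */
    uint32_t fraction;   /* 23 bits below the implied leading 1 */
};

struct single_fields decode_single(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);   /* host float must be 32 bits */
    memcpy(&bits, &value, sizeof bits);    /* reinterpret without aliasing */

    struct single_fields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For example, -1.5 decodes to sign 1, biased exponent 127 and fraction 0x400000, which is hardly something a user would want to type at a keyboard.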

The floating-point commands are used exactly the same way as standard
commands are used. For example, to add the contents of FP0 to FP2:

        FADD.X  FP0,FP2

To add a single-precision number pointed to by A0 to FP2:

        FADD.S  (A0),FP2


As you can see in the second example, it is possible to have operands in 
memory also. When you use a memory operand you must specify the format
of the number. All memory operands are converted into the extended format,
and all internal calculations or calculations involving two floating-point 
registers use the extended format exclusively. The 68881 supports a full range 
of mathematical functions including format conversions, absolute and integer 
values, arithmetic, trigonometric and exponential functions, and comparisons. 
Many of the comparison and branching operations of the 68000 family have
corresponding floating-point commands including FTST (test an operand and set
the floating-point condition codes), FDBcc (test floating-point condition,
decrement and branch on result) and so forth. In addition, the processor tests
for illegal operations (divide by zero, for example) and invalid numbers, and
reports them or causes a trap when they occur.

One last question is important to consider when examining floating-point 
processors: How fast are they? Unfortunately, this question is one of the hardest
to answer. The time it takes to perform a floating-point calculation depends 
on many factors, including the type of operation, the format, the clock speed, 
whether it is a coprocessor or peripheral, the effective address calculation 
time, and whether or not the instruction was in the cache memory. A typical 
multiplication (register to register, at 16.67 MHz clock) is 98 clock cycles, or 
less than 6 microseconds. Try to beat that in software! 
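
The conversion from clock cycles to time is simple enough to check by hand, but a short C helper makes the arithmetic explicit. The function name cycles_to_microseconds is hypothetical; the inputs are the figures quoted in the text.

```c
#include <assert.h>

/* Time in microseconds taken by `cycles` clock periods at `clock_mhz` MHz.
 * One cycle at 1 MHz lasts 1 microsecond, so the result is cycles / MHz. */
double cycles_to_microseconds(unsigned cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles / clock_mhz;
}
```

Calling cycles_to_microseconds(98, 16.67) gives a little under 5.9 microseconds, consistent with the "less than 6 microseconds" figure.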


Memory Management 

When developing more complex systems, especially systems that are 
intended to support multitasking, managing limited memory resources 
becomes critical. Let's take a quick look at the features you might want in a 
memory management system. 

• Virtual memory. The 68020 supports a full 32-bit (4-gigabyte) memory
space. Even with today's low memory prices, 4 gigabytes of fast memory
is a rather expensive proposition. Nevertheless, it is possible to provide the 
processor with the entire address space by dividing the memory space into 
pages (uniform blocks) and keeping in fast memory only those pages 
currently being used. The remainder of the pages are kept on a slower but less 
expensive storage device, such as a hard disk. The pages are typically 256 
bytes to 32K in size. 

Our memory management unit (MMU) should sit between the CPU and 
memory. Each time the CPU accesses memory (at a logical address) the 
MMU checks to see if that page is in memory. If it is, the MMU allows the 
request to go through, generating a physical address to the memory. If not, 
the MMU causes an interrupt that redirects the processor to a routine that 
loads the desired page into memory from the disk. Then execution of the 
interrupted program resumes, using the freshly loaded memory. In addition, 
the MMU should keep track of any pages in fast memory that are modified so 
that the operating system can determine whether a page needs to be rewritten 
to disk before being discarded (pages are discarded to make room for other 
pages when fast memory is full). 

• Address translation and code relocation. As long as the memory
management unit is already maintaining a table of page information (the
address translation table), it might as well keep some additional information around
to support more complex functions. Imagine a situation with several 
programs running at once. Without memory management, each program will 
have to be relocated to a different address. That is fine if the code was written 
to be relocated, but why place unnecessary restrictions on the programmer? 
Instead, we can let the MMU do the address translation as well as the paging. 
Each program sees itself as running in its own logical address space (usually 
starting at address 0), even though in reality all the programs are located at 
different physical addresses. 

• Intertask protection. With physical memory supporting many tasks
simultaneously, it is essential to provide a logical separation so that no task 
can interfere with another. The address translation table must include access 
information for each page determining which tasks can read, write or execute 
that page. 

• Sharing pages. It is often useful to be able to share certain pages in
memory. Operating system resources are often available to any application. 
Allowing different applications to share code can improve system 
performance. 

Consider the example in which several programmers are using the editor 
on a multiuser system. If it's properly written, the editor program can be 
shared by the tasks. The MMU must provide protection so that each of the 
tasks can execute the shared code, but none of them can modify it. (See 
Figure 3.1.) 

• Multiple privilege levels. The memory management system should
support different levels of privilege. For example, in a sophisticated database 
system, one might want to provide for certain shared data areas. Depending on 
the privilege level of the user, the application should be allowed to read 
certain areas and perhaps modify others. 

Clearly such a memory management system is a valuable tool when trying 
to achieve the full capability of a 68000-based system. Motorola created a 
memory management coprocessor, the 68851, to fulfill this need. 
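
The features listed above can be tied together in a toy software model of a single-level address translation table. This is a sketch of the general idea only, not of how the 68851 works internally; the page size, table size, field names and the function translate are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define NUM_PAGES 16u     /* assumed size of this toy logical space */

enum access { ACCESS_READ = 1, ACCESS_WRITE = 2, ACCESS_EXECUTE = 4 };
enum mmu_result { MMU_OK, MMU_PAGE_FAULT, MMU_PROTECTION_FAULT };

struct page_entry {
    bool     present;    /* page is currently in fast memory */
    bool     modified;   /* page was written since it was loaded */
    unsigned allowed;    /* bitwise OR of permitted enum access values */
    uint32_t frame;      /* physical page frame number */
};

/* Translate a logical address, enforcing protection and tracking writes. */
enum mmu_result translate(struct page_entry table[NUM_PAGES], uint32_t logical,
                          enum access kind, uint32_t *physical)
{
    uint32_t page = logical / PAGE_SIZE;
    assert(page < NUM_PAGES);
    struct page_entry *e = &table[page];

    if (!(e->allowed & kind))
        return MMU_PROTECTION_FAULT;   /* task may not access this page */
    if (!e->present)
        return MMU_PAGE_FAULT;         /* OS must load the page, then retry */
    if (kind == ACCESS_WRITE)
        e->modified = true;            /* page must be written back later */

    *physical = e->frame * PAGE_SIZE + logical % PAGE_SIZE;
    return MMU_OK;
}
```

Giving each task its own table provides relocation and intertask protection, while two tables that map a page to the same frame with execute-only permission model shared code.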


More 68851 Features 


In addition to the above features, the 68851 supports a tree-structured page 
table. That means that each entry in the address translation table may point to 
either a page or to another entry. This structure provides several advantages. 
On large systems the address translation table itself may be huge, possibly 
even larger than memory itself. By using a tree-structured table, you can swap 
entire sections of the page table out to disk when they are not needed. You 
can also modify or enable entire branches of the table in one operation. For 
example: If each task has its own branch in the table, task protection becomes 
easy. It is only necessary to check the protection on the root table entry for 
that task instead of examining the entry for each of the task's pages. 

FIGURE 3.1 Tasks executing shared code.

The 68851 also has an on-board cache containing the 64 most recently
used table entries, so that the majority of memory accesses will not involve
delays due to page-table lookup. In addition, the 68851 accesses memory
independently of the CPU; thus, the CPU does not need to be involved in
reading or writing the page table entries.
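
A two-level table is enough to illustrate the tree idea: the root entry carries the protection for a whole branch, and a missing branch stands for a section of the table that has been swapped out. The sketch below is a simplified model with hypothetical names and a fan-out of four entries per level, not the 68851's actual table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4u   /* assumed fan-out per table level */

struct leaf_table {
    uint32_t frame[ENTRIES];       /* physical frame for each page */
};

struct root_entry {
    bool               writable;   /* protection for the whole branch */
    struct leaf_table *branch;     /* NULL if this branch is swapped out */
};

/* Walk root -> leaf. Returns true and stores the frame on success; false if
 * the branch is absent or the root entry forbids the access. */
bool walk(const struct root_entry root[ENTRIES], unsigned hi, unsigned lo,
          bool is_write, uint32_t *frame)
{
    assert(hi < ENTRIES && lo < ENTRIES);
    const struct root_entry *r = &root[hi];

    if (r->branch == NULL)
        return false;              /* whole branch must be fetched first */
    if (is_write && !r->writable)
        return false;              /* one check protects every page below */

    *frame = r->branch->frame[lo];
    return true;
}
```

Note that the write check happens once at the root, so the leaf entries need no per-page protection bits in this model.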

Like all 68020 coprocessors, the 68851 actually extends the 68020
instruction set. All of the instructions necessary to manipulate the page table and
control memory access become part of the 68020 instruction set. As usual, 
the chip can also be used as a peripheral (with some loss in efficiency) by 
other 68000 family processors. 


Building a System 

So far we've described ways to extend the power of the basic processor to 
provide floating-point support and memory management. Now that we have 
defined a computational system, we can proceed to develop a system to go 
around it. Figure 3.2 is a block diagram of such a system. Keep in mind that
it is only one example of an enormous number of possible configurations.
System architectures are usually designed to maximize performance for a 
particular application. This system has been designed to maximize 
performance in demonstrating some of the peripheral devices commonly used 
on 68000 family systems. 

The system is broken up into four major functional blocks. The first is the 
main computational block, which consists of a 68020 CPU, a floating-point 
coprocessor and a memory management coprocessor. This section would be 
the one visible to the user, the one that would run most user tasks. It is also 
supported by a DMA controller and a 68901 multifunction peripheral that will
be described later. The main computation section has its own private 
memory. It does not, however, include I/O devices or peripherals. 

The next subsystem handles communications. This section uses one or 
more 68681 DUARTs (to be described) to handle communications to and 
from terminals or other serial devices. 

The third subsystem doubles as a file server and global memory bus. It has 
its own 68000 family processor dedicated to controlling any disk and tape 
drives. The shared memory in this section is used as a transfer area for storage 
device data and to allow communications among the different subsystems. 

The fourth subsystem can consist of one or more custom subsystems 
including additional computational systems, communications systems, file 
servers, network servers or other systems.

These subsystems are tied together by a 68452 bus arbitration module. 
(Chapter 1 describes how the 68000 bus allows several devices to share the 
system bus.) This functionality requires additional support circuitry to 
arbitrate among the processors and accomplish the actual transfer of control. 
The 68452 contains the circuitry to support up to eight bus masters. The 
system is expandable (virtually without limit) through the use of multiple 
bus arbitration modules and allows the designer to specify device priority in 
hardware. 
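
The text does not describe the 68452's priority scheme, so the following C sketch simply assumes a fixed priority in which the lowest-numbered requester wins, and models expansion by chaining modules in order. The names NO_GRANT, arbitrate and arbitrate_cascaded are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NO_GRANT (-1)

/* `requests` has bit n set when bus master n wants the bus.
 * Assumed policy: the lowest-numbered requester has the highest priority. */
int arbitrate(uint8_t requests)
{
    for (int master = 0; master < 8; master++) {
        if (requests & (1u << master))
            return master;
    }
    return NO_GRANT;
}

/* Several eight-input modules chained together: earlier modules outrank
 * later ones, and masters are numbered module * 8 + input. */
int arbitrate_cascaded(const uint8_t *modules, int count)
{
    assert(count > 0);
    for (int m = 0; m < count; m++) {
        int local = arbitrate(modules[m]);
        if (local != NO_GRANT)
            return m * 8 + local;
    }
    return NO_GRANT;
}
```

A hardware arbiter makes this decision combinationally rather than in a loop; the loop here only expresses the priority order.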

The remainder of this chapter will consist of more detailed descriptions of 
each section, including descriptions of the integrated circuits used, a typical 
memory map and a brief description of some of the subroutines that one would
use to implement inter-subsystem communication. 


The Computational System 

In our example system, the computational subsystem would probably be 
most effective running a message-based operating system. The processor 
would receive a signal from the message processor when a message is directed 
toward it. It would then transfer the message and associated data to the private 
memory. This subsystem would transfer data in large blocks on a fairly high 
level. The user interface, for example, would deal with entire lines (or pages) 
rather than single characters. 


Data transfers to and from storage devices would take place on a file or page basis;
the system would not need to deal with the mechanics of disk or tape control. 
Since all communications would be between this system and the shared 
memory, fast data transfer between the two is essential. This is provided by 
the 68440 Direct Memory Access (DMA) controller. The DMA controller is 
programmed with source and destination address information and the amount 
of data to be transferred. It then transfers the data independently of the main 
CPU. Being specially designed for such tasks, the DMA controller is 
considerably faster than the processor for this job. 
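
The programming model described here (source, destination, count, then let the controller run) can be mimicked in C. The structure below is a hypothetical stand-in for a DMA channel and does not reflect the real 68440 register layout; dma_run simulates the controller completing the whole transfer on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical channel state; the real 68440 registers are not modeled. */
struct dma_channel {
    const uint8_t *source;
    uint8_t       *destination;
    size_t         count;   /* bytes remaining to transfer */
    int            done;    /* set when the transfer completes */
};

/* The CPU's part of the job: load the addresses and the byte count. */
void dma_program(struct dma_channel *ch, const void *src, void *dst, size_t n)
{
    assert(ch != NULL && src != NULL && dst != NULL);
    ch->source      = src;
    ch->destination = dst;
    ch->count       = n;
    ch->done        = 0;
}

/* The controller's part: move the block without CPU involvement, then
 * flag completion (a real controller would raise an interrupt). */
void dma_run(struct dma_channel *ch)
{
    memcpy(ch->destination, ch->source, ch->count);
    ch->count = 0;
    ch->done  = 1;
}
```

In the example system the source would be a buffer in global shared memory and the destination a page in the computational subsystem's private memory.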

The 68901 multifunction peripheral contains several functional blocks that 
are normally implemented with separate chips. The first block consists of 
four timer counters, which could be used to control the memory refresh logic, 
generate timing signals for a serial port, provide a real-time clock interrupt 
and still leave one timer available for user applications or other real-time 
control processes. 

The 68901 also contains a USART (Universal Synchronous/Asynchronous 
Receiver/Transmitter, or serial port), which in this type of system 
configuration would probably be used to support a system or control 
terminal, or perhaps a debugging terminal. It also has a 16-source interrupt 
controller that would probably be used to control the interrupts from the 
various devices on the bus. In this system, interrupts can be generated by any 
of the devices including the FPU, MMU, DMA controller, message handling 
hardware and the 68901 itself. As a final treat, the 68901 contains a general- 
purpose I/O port that could be used for miscellaneous control or perhaps to 
control an LED indicator light on a front panel. 

A typical memory map for the computational section is shown in Figure
3.3. The operating system would need to support a number of low-level
routines to support communications with the other subsystems. One of the
most common methods for multiprocessing systems is a message-based
communication system. There are many ways to implement such a system, but in
this example we'll assume that the global memory has a message area
and that each subsystem has its own message table in which it places
messages that it is sending.


FIGURE 3.3 Memory map for computation system. Regions shown: Operating
System Program and Data Area; Application Memory; Address Translation Table.

Some typical routines that might exist in such a system are:

• Get-Message(). There needs to be some mechanism for the various
subsystems to signal each other that there is information available. Assuming 
that there is some hardware (interrupt) support to indicate that a message is 
waiting, this routine would copy the message information from the global 
message area into that part of the operating system data area that deals with 
messages. It would then mark the message as received so that the space in the 
global message table could be reallocated. 

• Send-Message(destination, message information). This routine
would perform several functions. First, it would copy the message into the 
global memory message table reserved for this subsystem. It would then 
generate a message-available signal to the destination subsystem. The table 
space allocated to this message would not be freed until the target subsystem 
notified the sender that the message has been received properly. A more 
sophisticated system might include a time-out or message monitor that would 
make sure that undeliverable or unreceived messages are dealt with correctly. 

• Read-Page(pageinfo) would be used by the system when a page fault
occurs. When the MMU notifies the processor (through a page-fault interrupt) 
that a page needs to be read off disk, the CPU finds a free space in memory, 
then sends a message to the file server indicating the page that is needed. The 
file server returns a message when the page is available and the CPU then 
directs the DMA controller to do a fast transfer of the page from global 
memory to local memory. Note that since the CPU no longer needs to do the 
actual disk operations, the CPU can instead suspend the current task (which 
had the page fault) and continue work on another task until the desired page is 
available. 

• Write-Page(pageinfo). If the CPU needs free memory space in order
to load a new page and none is available, the CPU needs to discard an existing 
page. If that page has been modified it needs to be written back to disk. That's 
the job of Write-Page. The page could be transferred to the global message 
table area for the file/disk server to process at its convenience. Once again 
performance improves as the CPU can proceed to other operations while the 
file/disk server does the actual writing to disk. 
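
The message routines above might be organized along the following lines. This C sketch is one possible arrangement under stated assumptions: fixed-size text messages, a four-slot table per sending subsystem, and hypothetical names throughout. It leaves out the hardware signal to the destination and the locking that real shared memory would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TABLE_SLOTS  4    /* assumed slots per subsystem table */
#define MESSAGE_SIZE 32   /* assumed fixed message size in bytes */

enum slot_state { SLOT_FREE, SLOT_SENT, SLOT_RECEIVED };

struct message_slot {
    enum slot_state state;
    int  destination;              /* ID of the target subsystem */
    char body[MESSAGE_SIZE];
};

struct message_table {
    struct message_slot slot[TABLE_SLOTS];
};

/* Sender: copy a message into its own table. Returns the slot used, or -1
 * if the table is full. A real system would now signal the destination. */
int send_message(struct message_table *t, int destination, const char *text)
{
    assert(strlen(text) < MESSAGE_SIZE);
    for (int i = 0; i < TABLE_SLOTS; i++) {
        if (t->slot[i].state == SLOT_FREE) {
            t->slot[i].destination = destination;
            strcpy(t->slot[i].body, text);
            t->slot[i].state = SLOT_SENT;
            return i;
        }
    }
    return -1;
}

/* Receiver: copy out the first pending message addressed to `me` and mark
 * it received so the sender knows the slot can be reused. */
bool get_message(struct message_table *t, int me, char out[MESSAGE_SIZE])
{
    for (int i = 0; i < TABLE_SLOTS; i++) {
        if (t->slot[i].state == SLOT_SENT && t->slot[i].destination == me) {
            strcpy(out, t->slot[i].body);
            t->slot[i].state = SLOT_RECEIVED;
            return true;
        }
    }
    return false;
}

/* Sender: free every slot whose message has been acknowledged. */
void reclaim_slots(struct message_table *t)
{
    for (int i = 0; i < TABLE_SLOTS; i++) {
        if (t->slot[i].state == SLOT_RECEIVED)
            t->slot[i].state = SLOT_FREE;
    }
}
```

Keeping a slot occupied until the receiver marks it matches the rule that table space is not freed until the target has confirmed receipt.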


The File Server and Shared Memory Subsystem

This system contains a global shared memory available to all subsystems. 
It has a dedicated 68000 family processor used to support file, disk and tape 
operations. The 68000 might run out of the global memory, or in a more 
complex system have its own private memory, in which case it would access 
global memory through the 68452 Bus Arbitration Module in the same way 
as the other subsystems. The subsystem would also have dedicated hard disk, 
floppy and tape controller chips to support the low-level control and 
interfacing to the peripheral devices. 

The file server would operate on several levels. At the highest level it 
would deal with entire files. Other subsystems would only need to open,
close, read and write files or parts of files. All of the directory manipulation,
disk handling and low-level control would be done by the file server, thus 
taking an enormous load off the computational subsystems. On the next level 
the file server would maintain the virtual-memory image for those 
subsystems that use virtual memory. The server should be able to maintain 
separate virtual spaces for each of them. The lowest level would be for those 
rare occasions when very low-level (individual sector) or direct-device access is 
required. 

This section could also be responsible for controlling the
message-signaling system: examining all incoming messages and then signaling
the appropriate subsystem that a message is available.

Figure 3.4 shows a typical file server memory map. The File Server 
section would have a number of routines to handle typical file operations. 
Having an independent file server does, however, present several other 
interesting possibilities. Such problems as maintaining file/directory 
consistency (a common problem on Unix-type systems) can be handled by the 
file server without the intervention of other subsystems. Disk backup can 
also be handled directly and accomplished with minimal interference to other 
tasks. 

The file server could also be expanded into a full-scale network server. In 
such a system, the other subsystems could access resources across a network 
without requiring that the user be aware of the process. The 68000 has more 
than enough power to handle these tasks. 


FIGURE 3.4 File server memory map. Regions shown: Comp. Subsystem Message
Table; Comm. Subsystem Message Table; File Subsystem Message Table; Disk/Tape
Buffer Area; Local Code and Data (if server is running out of global memory).


Communications Subsystem

The last clearly defined subsystem in our demonstration system is designed 
to support serial communications. There are hundreds of serial I/O chips that 
can be used in such a configuration, but in this case the 68681 dual UART is 
specified. Each of these chips contains two serial ports, each capable of data
rates of 1 megabyte per second. The chip has double buffering on output (2
bytes of RAM within the chip to store data) and quadruple buffering on input,
minimizing the chances of losing data.

Serial communications actually requires a great deal of processor power. If 
no buffering existed on the DUART, an interrupt would occur every time a 
character was received or sent. Take a situation in which the subsystem is 
controlling 16 terminals at 9600 baud. This represents a worst-case transfer
rate of about 16,000 characters per second, which means an I/O interrupt
every 62.5 microseconds! Few operating systems can process a character this
quickly. Most systems cannot handle multiple users, disk access and data 
communications at high speeds without occasionally dropping a character. 
The problem is eliminated in our example system because the 68000 can run 
some extremely efficient character-handling routines. Combined with the 
68681's on-chip buffering, our system can reliably use extremely high data 
rates. 
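
The interrupt-rate estimate can be reproduced with a few lines of C. The sketch assumes 10 bits on the wire per character (one start bit, eight data bits and one stop bit), which the text does not state; the function names are hypothetical.

```c
#include <assert.h>

/* Aggregate character rate for `ports` serial lines running at `baud`
 * bits per second, with `bits_per_char` bits sent per character. */
double characters_per_second(unsigned ports, unsigned baud,
                             unsigned bits_per_char)
{
    assert(bits_per_char > 0);
    return (double)ports * baud / bits_per_char;
}

/* Average time between characters (and so between per-character
 * interrupts) in microseconds. */
double microseconds_per_character(double chars_per_second)
{
    assert(chars_per_second > 0.0);
    return 1e6 / chars_per_second;
}
```

Sixteen ports at 9600 baud give 15,360 characters per second under this assumption, or roughly 65 microseconds per character. Rounding up to 16,000 characters per second gives the 62.5-microsecond figure quoted in the text.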

Not only can the communication subsystem improve communication 
reliability, but it can offload a great deal of work from the computational 
subsystem. In the case of block data communication (file transfers, etc.), the 
processor can take care of all protocols, error correction, data retransmission 
and so forth. Only when the block has been transferred correctly does the 
subsystem notify the computational system that a block has been received or 
sent. 

In the case of communication with a user at a terminal, the line protocol 
can be handled directly. Character echo, backspaces, local editing and other 
terminal features can be handled locally. Transfer to and from the 
computational subsystem can take place by line or by page. Each user or data 
link would have its own private buffer. 

Figure 3.5 shows a typical memory map for such a communication
subsystem. The communication subsystem might include some of the following
support subroutines:

• Assign-Channel(task-identifier) would assign a physical serial
port to a specific task. It would be called by the high-level (or operating 
system) program in the computational subsystem and be handled by the 
communications subsystem. A new buffer would be allocated in local 
memory and the appropriate communications link would be established. 

• Send-Data(channel) would be used to transfer blocks, lines or pages
of data to and from the communications subsystem. Additional routines to 
establish protocols, communications modes and so on would also need to be 
provided.

FIGURE 3.5 Memory map for communication subsystem. Regions shown:
Communication Buffers #1 through #N; Code and Data for Subsystem.


Conclusion 

This brief overview of how to build a 68000 system only scratches the 
surface of the many possibilities available with the 68000 family. For 
additional information, refer to the data sheets for the individual devices. In 
addition to detailed programming and design information, they usually include 
typical system interconnections and architectures for each device. 


Bringing Up the 68000: A First Step 


Alan D. Wilcox 


The technique described here will get a 68000 processor 
working very quickly. By breaking the normally closed 
loop between the 68000 and memory, you can force the 
68000 to execute an infinite chain of NIL instructions, 
making it possible to set up and debug the lowest levels of 
hardware without installing memory or decode logic. This 
chapter is adapted by permission of the publisher, Prentice- 
Hall, Inc., from 68000 Microcomputer Systems: Designing 
and Troubleshooting by Alan D. Wilcox. No part of this 
adaptation may be reproduced, in any form or by any 
means, without written permission from the publisher. 
This chapter originally appeared in Dr. Dobb's Journal 
of Software Tools, January 1986. 


There are two ways to bring up a new 68000 microcomputer system: the 
hard way and the easy way. The hard way is to use the traditional 
approach of designing the hardware and then using a development system 
along with test software and some in-circuit emulation. Given enough hours 
of testing and correcting problems, the 68000 system has a good chance of 
running successfully. In contrast, the easy way is to design, build and test the 
hardware, module by module, using the 68000 as a free-running processor. 

The impact of the free-running technique on hardware development is quite 
startling. The 68000 kernel shown in Figure 4.1 can be made to run so easily 
that a logic probe can test it. You don't need to use sophisticated digital 
development tools such as a logic analyzer or a development system with an 
in-circuit emulator. If troubleshooting is necessary, you need only a common 
dual-trace oscilloscope. 

FIGURE 4.1 The 68000 kernel is the essential hardware for 
program execution. 

FIGURE 4.2 To free run the 68000, the normal feedback path from 
memory is disconnected and a NIL (do-nothing) instruction 
is substituted. 

Free running the 68000 means that the processor is allowed to execute a 
do-nothing instruction continually. That is accomplished by breaking the 
normally closed loop between the 68000 and its memory, as shown in Figure 
4.2. Instead of carrying program instructions from memory, a one-word 
instruction (call it a NIL instruction) can be jammed onto the data bus. (The 
mnemonic NIL is not part of the 68000 instruction set per se; I coined it as a 
simple expression of the instruction used for free running.) Thus when the 
68000 reads the data bus for an instruction, it fetches the NIL word, executes 
it, increments the address, and reads the next NIL. This cycling repeats over 
the entire 16-megabyte address range; when the processor reaches the end of 
the 16 megabytes, it simply starts over again. 
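
The wraparound can be pictured with a small Python model. It is only a sketch: it assumes the 4-byte spacing between NIL instructions that Table 4.1 lists, and the names are invented for the example.

```python
ADDRESS_MASK = 0xFFFFFF   # 24 address bits, a 16-megabyte range
NIL_LENGTH = 4            # assumed spacing between NIL fetches, per Table 4.1

def next_fetch_address(pc):
    """Address of the next NIL fetch, wrapping at the top of the range."""
    return (pc + NIL_LENGTH) & ADDRESS_MASK

print(hex(next_fetch_address(0x000000)))
print(hex(next_fetch_address(0xFFFFFC)))
```

The second call shows the address falling back to 0 after the last NIL at FF FFFC.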

The strategy for bringing up the 68000 is to design, build and test the 
68000 kernel shown in Figure 4.2. Next, design and build an additional 
module, connect it to the kernel, and test it while the processor is free 
running. Add yet another module and test it while free running. You can free 
run the 68000 all the way through the construction of a complete CPU board. 
In fact, if a processor board fails, you can usually free run it to help speed 
troubleshooting. The only part of the system that you cannot test easily 
while it is free running is the data bus itself, because the NIL instruction is 
forced on the bus. 


Up by the Bootstraps 

Several steps are involved in bringing up the 68000 using the free-running 
technique. The following describes the necessary first step: how to get the 
kernel running. Once the kernel is operating, the rest is fairly straightforward. 
The following steps provide a brief overview of the entire scenario for 
completing a working 68000 CPU board. 

1. Bring up the kernel. Design, build and test the power system, the 68000 
clock and drivers, the reset and halt module, and the 68000 module. 

2. Add a wait state and data transfer acknowledge (DTACK*) generator 
module. 

3. Add RAM and EPROM decoding circuits, connect address and control 
bus circuits. 

4. Write a simple looping program for a pair of EPROMs. Remove the 
NIL instruction and close the broken loop between the EPROMs and the 
68000 data bus. The processor should now be able to read its stack and 
program counter vectors from the EPROMs and execute the loop program. 

5. Add the RAM connections to the data bus. If the 68000 is still running 
successfully with the simple loop program, add more code to include reading 
and writing RAM. If the code is a very tight loop, an oscilloscope will 
synchronize easily and you can use it to check the timing of the various 
control lines to all the memory in the system so far. 

6. Modify the reset and EPROM-control circuits so that the EPROMs do 
not have to be decoded at address 0 except during reset. Normally, the low 
memory addresses should be RAM so that exception vectors can be altered 
dynamically; EPROMs should be elsewhere. 

7. If at least 4K of RAM has been decoded starting at address 8, and the 
EPROMs are decoded for 0 to 7 and $8000 to $BFFF, then the system can 
use the Motorola Tutor EPROM set. When you restart the system, assuming 
that it does not halt, you can use an oscilloscope to see the activity on the 
various processor lines while the monitor waits for a console keystroke. 

8. Add a 6850 ACIA decoded at $010040 to serve as a console port and test 
it with the Tutor EPROM set. Run various memory-testing commands and 
scope loops to check operation of the new system. 
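
The decoding described in steps 7 and 8 can be summarized in a short Python sketch. It is a model of the address map only, not a circuit design. Two details are assumptions: RAM is taken to run up to $7FFF, as on the ECB, and the ACIA is given a 4-byte window starting at $010040.

```python
def select_device(address):
    """Return the device a 24-bit address would select in this assumed map."""
    address &= 0xFFFFFF
    if address <= 0x000007:
        return "EPROM"      # reset vectors: initial SSP and PC
    if address <= 0x007FFF:
        return "RAM"        # assumed upper limit, as on the ECB
    if address <= 0x00BFFF:
        return "EPROM"      # Tutor code at $8000 to $BFFF
    if 0x010040 <= address <= 0x010043:
        return "ACIA"       # assumed 4-byte window for the 6850
    return "unmapped"

for probe in (0x000000, 0x000008, 0x008146, 0x010040, 0x00C000):
    print(f"{probe:06X} -> {select_device(probe)}")
```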


A Kernel With No Memory 

As stated earlier, the first step is to bring up the kernel in the free-running 
mode. It seems a bit overwhelming when you first try it, but it really is quite 
simple. Unless you made a wiring error, the 68000 is virtually guaranteed to 
come alive and begin executing the NIL instruction. To bring up the kernel, 
you need the power system, the clock and drivers, the reset and halt module, 
and the 68000 module. 

The size of the power supply depends on the nature of the system and what 
loading it will have in the final configuration. In my case, I intended to use 
the 68000 processor board in an IEEE-696 (S-100) system, so I needed on- 
card regulation from an 8-volt system supply. A common 7805 circuit was 
adequate for the processor and its RAM and EPROMs; I used a second 7805 
circuit for the rest of the LS-TTL logic on the CPU board. 

Watch the Motorola data manual for footnotes. In the case of the 68000, 
although the data indicates a power requirement of 300 mA or so, that is not 
the whole story. The fine print at the bottom of one page casually mentions 
that the 68000 might require a peak current of some 1.5 A. Make sure the 
power supply can handle the peak current without falling out of regulation. 
Likewise, power and ground leads to the 68000 need to be heavy, say #24 
wire rather than #30 wire-wrap wires. Locate bypass capacitors close to each 
of the power connections. 

You can design and build the clock circuit using a crystal, some resistors 
and capacitors, and a 7404 or similar, but it's hardly worth the effort. For 
prototype work, being able to change the clock frequency easily without 
redesigning the circuit is a distinct advantage, so using a DIP oscillator is 
appropriate. I used a 6 MHz oscillator in my S-100 prototype to keep within 
the bus specification; only after I had finished the system did I run it up to 10 
MHz and later to 12 MHz. 

Also, you will use both the clock signal and its complement in the final 
circuit design. The complement clock could be derived from a 74LS04, but 
that would introduce a skew between the two clocks of some 10 to 15 ns, 
depending on loading. Although this amount of skew seems slight, it can 
cause severe timing difficulties when the clock speed gets above 10 MHz. The 
74265 quad complementary-output element with a worst-case skew of 3 ns is 
a good selection; in my 12 MHz prototype, this selection has worked out 
well. 

The reset and halt module has two basic functions. One task is to hold the 
68000 HALT* and RESET* lines low for at least 100 ms on power-up. Its 
other function is to pull the same two lines low for at least ten clock cycles 
for a reset button press at any time after the power has been on. 

The circuit in Figure 4.3 provides a simple and reliable reset function for 
the 68000. It provides a reset pulse either on power-up or whenever you press 
the reset button. Open-collector devices are required because the 68000 
HALT* and RESET* controls are both bidirectional. For example, the 68000 
can itself drive the RESET* line to reset any peripherals if the software 
RESET instruction is issued. Also, the 68000 can force the HALT* line low 
if the system cannot continue processing. A single "halt" LED connected as 
shown is valuable in helping bring up the processor for the first time. 

The last module in the minimum system is the 68000 processor shown in 
Figure 4.4. By now, you should have checked the power, clock and reset 
modules for proper operation and connected them ready for the 68000. If the 
processor is wired as shown, it should begin free running immediately. On 
power-up the HALT light should flash briefly, and then the TEST light will 
begin flashing on and off. 

Earlier I referred to my so-called NIL instruction. As you see in the circuit, 
the data bus is completely grounded so the NIL has an opcode of 0000. In the 
context of its use in free running, it acts like a no-operation or NOP. The 
68000 does have a NOP opcode ($4E71), but this NOP will not work as a 
free-running instruction because a critical constraint on the opcode precludes 
using the NOP instruction in free running: Whatever is wired to the data bus 
for the 68000 to read upon reset must be even. The reset sequence is this: The 
68000 will do four 16-bit reads to get the initial SSP and PC vectors; then it 
will fetch its first opcode at the address in the PC. If the PC is not aligned on 
an even address, the 68000 detects an address error and immediately begins 
illegal-address exception processing. It tries to push its status on the stack at 
the beginning of the exception, but the stack is also an illegal address (the 
same noneven number as in the PC). The result is the fatal double bus fault 
that stops all processing and asserts the HALT* output. 
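
The constraint can be checked with a few lines of Python. This is only a model of the reset sequence described above, under the assumption that every bus read returns the same 16-bit word; the function name and return strings are invented for the example.

```python
def reset_outcome(bus_word):
    """Predict what happens after reset when every read returns bus_word."""
    vector = (bus_word << 16) | bus_word   # two 16-bit reads form one 32-bit vector
    ssp, pc = vector, vector               # reads 1-2 give the SSP, reads 3-4 the PC
    if pc % 2 == 0:
        return "running"                   # first opcode fetch is at an even address
    if ssp % 2 == 1:
        return "halted"                    # address error, then the stacking fails too
    return "address error exception"

print(reset_outcome(0x0000))   # grounded data bus (the NIL word)
print(reset_outcome(0x4E71))   # the real NOP opcode
```

An even word is necessary but not sufficient: the word must also decode as a harmless instruction, which is why $0000 is convenient.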

The opcode 0000 does in fact correspond to a real instruction in the 68000 
set. It is the mnemonic ORI.B #0,D0, and it was selected for free running for 
two reasons: because it was even and because connecting all grounds to the 
data bus was simpler than making sure one or two data lines had a logic 1 on 
them. When the instruction is considered in its free-running environment, the 
appearance of its memory is as shown in Table 4.1. 

FIGURE 4.3 Circuit diagram of a simple power-up and reset timer 
circuit for a 68000 processor. Note the use of open-collector 
devices on the bidirectional HALT* and RESET* controls of 
the 68000. 

FIGURE 4.4 Circuit diagram of the minimum 68000 system 
for free-run test. 

Address      Data         Program 

00 0000      0000 0000    ORI.B #0,D0 
00 0004      0000 0000    ORI.B #0,D0 
00 0008      0000 0000    ORI.B #0,D0 
00 000C      0000 0000    ORI.B #0,D0 
00 0010      0000 0000    ORI.B #0,D0 
 . . . 
FF FFFC      0000 0000    ORI.B #0,D0 

TABLE 4.1 Contents of "memory" for free-running test. 


You can easily calculate the execution time of this "program." Each 
instruction takes eight clock cycles (two read bus cycles), so for a 6 MHz 
clock, the execution time is 8 times 167 ns, or approximately 1.33 
microseconds. Because each instruction occupies 4 bytes, a complete sweep 
through the entire 16 megabytes of the 68000 address range executes about 
4 million (4,194,304) instructions and takes 1.33 microseconds times 
4,194,304, or about 5.59 seconds. If you connect the TEST light to the top 
address bit, A23, it will be on for 2.8 seconds and then off for 2.8 seconds. 
I connected the TEST light to A20 permanently. It stays on for 0.35 seconds 
and off for 0.35 seconds, a reassuring flash rate during development work 
and not nearly as unsettling as a constant red HALT light. 
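
The same arithmetic can be reproduced in a few lines of Python. The sketch below simply restates the figures given above (6 MHz clock, eight clocks and 4 bytes per NIL instruction); the variable and function names are invented for the example.

```python
CLOCK_HZ = 6_000_000        # prototype clock
CLOCKS_PER_NIL = 8          # two read bus cycles of four clocks each
BYTES_PER_NIL = 4           # opcode word plus immediate word
ADDRESS_SPACE = 1 << 24     # 16 megabytes

nil_time = CLOCKS_PER_NIL / CLOCK_HZ                       # seconds per instruction
sweep_time = nil_time * (ADDRESS_SPACE // BYTES_PER_NIL)   # one pass through memory

def half_period(address_bit):
    """Seconds an LED on address line A<address_bit> stays on (or off)."""
    return nil_time * ((1 << address_bit) // BYTES_PER_NIL)

print(f"NIL time:   {nil_time * 1e6:.2f} microseconds")
print(f"Full sweep: {sweep_time:.2f} seconds")
print(f"A23 light:  {half_period(23):.2f} seconds on, same off")
print(f"A20 light:  {half_period(20):.2f} seconds on, same off")
```

Changing `CLOCK_HZ` gives the flash rate to expect at other clock speeds.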


Results 

Figure 4.5 shows the performance of the power-up timer circuit. The top 
plot is the main system power as it comes on and eventually regulates at 5 
volts on the CPU board. About 175 ms after the supply voltage is valid, the 
RESET* and HALT* lines go high to successfully complete a full 68000 
reset. 

The effect of this reset operation is shown in Figure 4.6. The last two 
lines on the timing diagram are the RESET* and HALT* controls for the 
68000. After its internal start-up time, the 68000 asserts its address strobe 
(AS*) and its data strobe lines (UDS* and LDS*) in the first read bus cycle. 
After four read bus cycles, the processor PC begins execution at address 0, as 
discussed above. 

DTACK* is the asynchronous bus control line that normally comes back 
from memory or peripherals to tell the 68000 to complete the current bus 
cycle. During the initial free run of the processor, there is nothing connected 
that will acknowledge a data transfer, so the control is grounded. The timing 
diagram shows this line at a logic low. The timing diagram also shows the 
read/write control, R/W*, as constantly high because the 68000 only does a 
read bus cycle when it is free running. 

Figure 4.7 shows the free-running processor with a DTACK* generator in 
operation. Notice the o and x markers bracketing a single bus cycle. The 
normal read bus cycle has a total of four clock cycles. If DTACK* is delayed 
for two cycles, as shown in Figure 4.8, then the bus cycle is lengthened and 
two "waits" are inserted into each bus cycle. When you interface memory or 
peripherals to the 68000, you can design each external module to hold back 
DTACK* until its unique timing requirements are met. 
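
The effect of wait states on bus timing is easy to tabulate. The following Python sketch assumes, as in the description above, a four-clock read bus cycle with each wait adding one clock; the function name is invented for the example.

```python
def bus_cycle_ns(clock_hz, wait_states=0):
    """Length of one bus cycle in nanoseconds: four clocks plus any waits."""
    clocks = 4 + wait_states
    return clocks * 1e9 / clock_hz

for waits in (0, 2, 3, 8):
    print(f"{waits} waits at 6 MHz: {bus_cycle_ns(6_000_000, waits):.0f} ns")
```

At 6 MHz the basic cycle is about 667 ns, and two waits stretch it to 1000 ns.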

As an example, the timing diagram in Figure 4.9 shows the system while 
not free running: It is executing the monitor program (Tutor) and waiting for 
a keystroke. The system was set to provide eight waits for I/O read 
operations, nine for writes, and three waits otherwise. Figure 4.10 shows a 
close-up of the bus cycles. The lower timing line, marked sM1, is the S-100 
bus status indicating an opcode fetch. 

FIGURE 4.5 Power-up performance of the 555 timer circuit. On 
power-up, the 555 timer with the parts given in the schematic 
provides about 175 ms RESET* to the 68000. 

FIGURE 4.6 Typical free run starting from a complete RESET* and 
HALT* asserted low. The clock is running at 6 MHz. DTACK* is 
grounded in this example. 

FIGURE 4.7 Typical free run with the DTACK* circuit enabled. The 
clock is 6 MHz and there are no wait states inserted. 

FIGURE 4.8 Typical free run with DTACK* enabled. This timing 
shows DTACK* delayed enough to cause two waits in each bus 
cycle. 

FIGURE 4.9 A view of the bus activity when the Tutor EPROM set 
runs at 6 MHz. The CPU board was set for eight waits on I/O and 
three waits otherwise. 

FIGURE 4.10 A closer look at the bus controls when Tutor executes 
a write bus cycle. 


Summary 

Bringing up the 68000 using the free-running technique is very different 
from the more traditional approaches to getting a processor running. You can 
see, though, just by the brief description of this first step in bringing up the 
68000 kernel, that you do not need sophisticated equipment to get started. If 
you have a little hardware experience, this is the ideal way to get started with 
your own homebrew system. 

Of course, getting a free-running 68000 going is only the first step. You'll 
need a system monitor, preferably in ROM or EPROM, and eventually you'll 
probably want to run one of the development-oriented operating systems. 
Nonetheless, the understanding obtained from building a free-running 68000 
is very valuable and can help you go on to design and build a complete 
system successfully. 

Note: See the end of Chapter 5, "Motorola's Tutor Firmware," for 
availability information. 


Motorola's Tutor Firmware 


Alan D. Wilcox 


The set of two ROM chips described here is a useful tool 
for anyone designing a new 68000 system. This chapter is 
adapted by permission of the publisher, Prentice-Hall, Inc., 
from 68000 Microcomputer Systems: Designing and 
Troubleshooting by Alan D. Wilcox. No part of this 
adaptation may be reproduced, in any form or by any means, 
without written permission from the publisher. 


One of the first problems a hardware designer has with a new 68000 
computer system is software. After designing and building all the 
hardware, making the system truly useful can be a major crisis, especially if 
you lack 68000 programming experience. It would be helpful to have a ready- 
made monitor that could be implemented immediately and could also be used 
to learn 68000 programming. 

The Motorola Tutor firmware is a 16K monitor program that is currently 
supplied with the 68000 Educational Computer Board (ECB). The purpose of 
the ECB is to provide individual training and education in programming and 
in hardware application of the 68000. Priced between $400 and $500, it is one 
of the best bargains available for 68000 experimentation. The Tutor firmware 
on the ECB is in two 24-pin ROMs comparable to MCM68764 EPROMs. 


The Monitor 


By itself, a monitor programmed in a pair of 8K-by-8 ROMs really is not 
all that exciting. After all, a monitor is a monitor, and they all do the same 
general functions: display registers, set breakpoints, trace programs, see and 
change memory, and so on. Tutor is more than that: It has the capability to 
do in-line assembly and disassembly. That is, it can do line-by-line interactive 
assembly directly from the standard Motorola assembly mnemonics. Each line 
of source code typed at the terminal is directly translated into the proper 
machine-language code and stored in memory for execution. 

For the beginning 68000 programmer, this immediate interactive assembly 
of code makes software not only easier, but fun as well. Rather than working 
through the usual time-consuming process of assemble, link, load and 
execute, you can check a program for correctness at each step. Some 
instructions seem to intentionally defy understanding, and the interactive 
nature of the Tutor monitor makes it easy to try an instruction and see what it 
does. 

Consider the contrast between Tutor and a disk-based debugger. With Tutor 
in ROM, a computer system can be used without having a disk operating 
system. For example, I normally run my system console through an ECB; 
that is, my console connects to one ECB port and the other ECB port goes 
directly to my CompuPro System 8/16. When I want to check a sequence of 
68000 code, I type Ctrl-@ to tell the ECB to communicate directly with me 
at the console; when done, I type TM to put the ECB in "transparent mode" 
and reconnect to the main system. If I just want to experiment with 68000 
code, I need not even turn on the main system. 


Into Another System 

In addition to being valuable as a program-learning tool, the Tutor 
firmware can be used in systems other than the ECB. The code is completely 
position independent: it can be decoded anywhere in memory and still be 
perfectly functional. Consequently, the firmware can be installed in any other 
68000 CPU board and used with only a couple of minor modifications. One 
change is to the first 8 ROM locations for the supervisory stack and initial 
program counter vectors, and the other is to the memory location of the I/O 
ports. Physically, the Tutor code should be placed in a pair of 27128s because 
they are more reasonably priced than MCM68764s and also because you 
might want to add more code. For example, I added extra code for booting 
CP/M-68K using Tutor. 

The memory map (Figure 5.1) shows that the Tutor code is located from 
8000 to BFFF hex; the ECB RAM is located between 8 and 7FFF hex. The 
68230 Parallel Interface/Timer (PI/T) is decoded at 10000 hex, and the two 
6850 ACIAs starting at 10040 hex. If you have a new 68000 CPU board with 
a memory and I/O layout exactly like the ECB, then you can duplicate the 
Tutor code into a pair of 27128s without making any changes. More than 
likely, your CPU and system memory configuration is entirely different. For 
example, my own 10 MHz 68000 CPU board is installed in the CompuPro 
cabinet with 128K static RAM, a Disk-1A controller and an I/O board with 
dual 68B50s. (See Figure 5.2.) 

The only changes made to Tutor so that it would run as a monitor for my 
68000 CPU board in the new system are shown in Table 5.1. All the other 
Tutor RAM locations were left just as in the ECB. For example, although 
my system has a 128K RAM starting at address 0, there is absolutely no 
difficulty in letting Tutor use the lower 2K for scratch RAM. Besides, the 
low memory area is used for 68000 exception vectors anyway, so there is 
really no point in modifying anything at all in low memory. 


Address (hex)          Contents 

$00 0000               SSP Vector 
 00 0004 - 00 0007     PC Vector 
 00 0008 - 00 03FF     68000 Exception Vectors 
 00 0400 - 00 08FF     Tutor RAM 
 00 0900 - 00 7FFF     User RAM 
 00 8000 - 00 BFFF     Tutor ROMs (16K x 8) 
 00 C000 - 00 FFFF 
 01 0000 - 01 003F     PI/T 
 01 0040               ACIA1 (Upper Byte) 
 01 0041               ACIA2 (Lower Byte) 
 01 0042 


FIGURE 5.1 Educational Computer Board memory map. (Source: 
MEX68KECB/D2 manual.) 


Address (hex)          Contents 

$00 0000               SSP Vector 
 00 0004 - 00 0007     PC Vector 
 00 0008 - 00 03FF     68000 Exception Vectors 
 00 0400 - 00 08FF     Tutor RAM 
 00 0900 - 01 FFFF     User RAM (128K RAM spans 00 0000 - 01 FFFF) 
 FD 0000               S-bug (16K x 8), DOS Boot, etc. (16K x 8) 
 FD 8000 - FD FFFF     On-Board Local RAM or EPROMs (64K spans FD 0000 - FD FFFF) 
 FF 0000 - FF FFFF     ACIA1 (FF 0040), ACIA2 (FF 0041), All Other I/O 


FIGURE 5.2 Memory map of the author's 10 MHz 68000 computer 
system, which uses the Motorola Tutor firmware as the monitor. 
The modified code is referred to as S-bug. 


                   Old Value      New Value 
Program Counter    00 00 81 46    00 FD 01 46 
ACIA#1             00 01 00 40    00 FF 00 40 
ACIA#2             00 01 00 41    00 FF 00 41 
Prompt Message     TUTOR 1.3      S-bug 1.3 

TABLE 5.1 Changes to Tutor for a new CPU board. 
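
The program counter change in Table 5.1 amounts to moving the entry point from one ROM base to another while keeping its offset. The Python sketch below illustrates that relationship and builds the 8 bytes that occupy the first ROM locations. The SSP value used is only a placeholder, not a value taken from the Tutor listing, and the names are invented for the example.

```python
import struct

OLD_ROM_BASE = 0x008000      # Tutor ROMs on the ECB (Figure 5.1)
NEW_ROM_BASE = 0xFD0000      # S-bug ROMs on the new board (Figure 5.2)
OLD_PC = 0x00008146          # original program counter vector (Table 5.1)

def relocate(address, old_base, new_base):
    """Keep the offset within the ROM, change only the base address."""
    return address - old_base + new_base

def reset_vectors(ssp, pc):
    """First 8 ROM bytes: initial SSP then initial PC, high byte first."""
    return struct.pack(">II", ssp, pc)

new_pc = relocate(OLD_PC, OLD_ROM_BASE, NEW_ROM_BASE)
placeholder_ssp = 0x00000780             # assumption, for illustration only
print(f"new PC = {new_pc:08X}")
print(reset_vectors(placeholder_ssp, new_pc).hex(" "))
```

The relocated value agrees with the New Value column of the table.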


Some Commands 

Table 5.2 displays a summary of the Tutor commands. I will not describe 
them here; I prefer to look at them as a new programmer or a hardware 
engineer might. Are they useful in helping to learn 68000 assembly 
programming? From my perspective as a hardware designer, are they useful in 
helping me to test and debug a new 68000 system? 

Look over the list as if you were just learning the assembly language. 
What would be most helpful? Certainly being able to display and change 
memory is a start; you can experiment with the 68000 MOVE instructions 
and see how they change memory. Likewise, the register display feature is 
just as useful. Rather than printing the registers all crammed into a couple 
of lines, the display shows the registers clearly so you can see the action easily. 
The various other commands, such as breakpoints and tracing, are also quite 
useful. 

The real bonus with the Tutor firmware is the interactive assembler 
capability. Being able to try an instruction and get immediate results is a 
definite plus in the learning process. Rather than study an instruction at great 
length, my approach is to try it out and see if it does what the book seems to 
say it should. Not being an expert assembly language programmer has an 
advantage: looking from the bottom up, I can see what is helpful in learning. 


Experience With Tutor 

Figure 5.3 shows a simple program to add registers D0 and D1 and then 
return control to Tutor by using the TRAP 14 function. The program was 
entered interactively line-by-line by typing the standard mnemonics. After 
setting the PC, D0 and D1, a DF command displayed the contents of all the 
registers. The GO caused execution of the program beginning at the $1000 set 
in the program counter. Another DF shows that the D1 register changed. A 
quick math check with the DC command shows that the answer in D1 is 
correct. I find this approach to learning the 68000 convenient and 
quick. If I write a large program, I can upload it to my host system and save 
it on disk. 
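
For readers who want to predict the result before trying it on the board, the word-sized add can be modeled in Python. This is a sketch of the data result only: the condition codes are ignored and the function name is invented. It relies on the fact that ADD.W changes only the low 16 bits of the destination register.

```python
def add_w(source, dest):
    """Model ADD.W source,dest: add the low words, keep dest's upper word."""
    low = (source + dest) & 0xFFFF
    return (dest & 0xFFFF0000) | low

print(f"{add_w(0x00001234, 0x00024578):08X}")   # the values used in Figure 5.3
print(f"{add_w(0x00000001, 0x0001FFFF):08X}")   # a carry out of the low word is dropped
```

The second case is a reminder that a word-sized add does not carry into the upper half of the register.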

Is it a useful monitor for hardware development? Yes, most definitely! I 
used it during the design and testing of my 68000 CPU board and it was 
extremely useful. The first programs that a new processor board runs, of 
course, are the simple ones like reading a single memory location over and 
over in a tight loop. After burning a few EPROMs with simple code like this 
to check memory and I/O, you eventually get to the point where you need a 
good monitor program. The Tutor firmware worked fine to write more 
complex test programs before the disk operating system was in place. 

The memory test program is one of the more useful built-in Tutor 
routines. When memory is being added to a new system, this test program 
can check for problems. The other memory commands, such as move and 
search, can help in the development process. For example, the memory search 
routine can be used to find key code in the monitor itself or in programs 
being temporarily modified for testing. 

Overall, Tutor has been an extremely helpful monitor ready to do useful 
work during the development of a new 68000 computer system. It can be 
applied effectively by a novice 68000 programmer with very little prior 
experience with the processor. It can be easily modified and used directly as 
the permanent system monitor to test system operation or to initiate code to 
boot a disk operating system. 


TUTOR 1.3 > MM 1000;DI 

001000    00000000    OR.B    #0,D0 ? ADD D0,D1            Interactive 
001002    00000000    OR.B    #0,D0 ? MOVE.B #228,D7       entry of 
001006    00000000    OR.B    #0,D0 ? TRAP #14             program 
001008    00000000    OR.B    #0,D0 ? 

TUTOR 1.3 > .PC 1000                                       Set program counter 

TUTOR 1.3 > .D0 1234                                       Set data register D0 

TUTOR 1.3 > .D1 24578                                      Set data register D1 

TUTOR 1.3 > DF 
PC=00001000 SR=2708=.S7.N... US=00005000 SS=00000780       Display the 
D0=00001234 D1=00024578 D2=00000000 D3=00000000            registers 
D4=00000000 D5=00000000 D6=00000000 D7=000000E4 
A0=00010040 A1=FFFFFFFF A2=00000414 A3=00000554 
A4=00009FB2 A5=00000540 A6=00000540 A7=00000780 
--------------------001000    D240                ADD.W   D0,D1 

TUTOR 1.3 > GO                                             Execute the 
PHYSICAL ADDRESS=00001000                                  program at $1000 

TUTOR 1.3 > DF 
PC=00001002 SR=2700=.S7..... US=00005000 SS=00000780 
D0=00001234 D1=000257AC D2=00000000 D3=00000000            Registers 
D4=00000000 D5=00000000 D6=00000000 D7=000000E4            afterwards 
A0=00010040 A1=FFFFFFFF A2=00000414 A3=00000554 
A4=00009FB2 A5=00000540 A6=00000540 A7=00000780 
--------------------001002    1E3C00E4            MOVE.B  #228,D7 

TUTOR 1.3 > DC $1234+$24578 
$257AC=&153516                                             Check answer using 
                                                           data conversion 
TUTOR 1.3 >                                                command. 


FIGURE 5.3 A simple ADD-registers program entered interactively. 
After the addition, the MOVE and TRAP return control to Tutor. 
All user entries are underlined; comments are in boldface. 


Command       Description 

MD            Memory display in hex/ASCII or disassembled 
MM            Memory modify in hex, ASCII or interactively assemble 
MS            Memory set 
DF            Display formatted registers 
.A0 - .A7     Display and set address registers 
.D0 - .D7     Display and set data registers 
.PC           Display and set program counter 
.SR           Display and set status register 
.SS           Display and set supervisory stack pointer 
.US           Display and set user stack pointer 
BF            Block fill memory with value 
BM            Block move memory 
BT            Block test segment of memory 
BS            Block search memory for hex or string 
DC            Data conversion for on-screen math 
HE            Help with available commands 
BR            Breakpoint set 
NOBR          Remove breakpoint 
GO, G         Go ahead with execution of program 
GT            Go until breakpoint 
GD            Go direct; like GO but no initial trace 
TR, T         Trace an instruction 
TT            Trace with temporary breakpoint 
DU            Dump memory to host in S-record format 
LO            Load S-records from host 
VE            Verify memory with S-records from host 
TM            Transparent mode to pass data from port 1 to port 2 
*             Send message following "*" to port 2 
PA            Printer attach 
NOPA          Reset printer attach 
PF            Port format: set nulls 
OF            Display offsets for relocating code 
.R0 - .R6     Display and set relative offset registers 


TABLE 5.2 A summary of the Tutor command set. 


Availability 

The following Tutor firmware and related materials are available: 

• A pair of MCM68A364 ROMs, as in the Educational Computer Board 
(ECB): $100. Order part 51AW4129B24 (odd ROM) or 51AW4129B25 (even 
ROM) from: 

Motorola Communications and Electronics, Inc. 
Phoenix Repair Depot 
1711 W. 17th St. 
Tempe, AZ 85821 
602/994-6472 

• The license and source code listing of Tutor (except for the one-line 
assembler source): $125. Order M68KTUTOR-D4 from any Motorola 
distributor. No ROMs included. 

• The license and source code listing of Tutor on VERSAdos disk: $400. 
Order M68KTUTORS from any Motorola distributor. 

• The Tutor instructions and ECB hardware documentation: $6.75. Order 
MEX68KECB/D2 from any Motorola distributor. 

• The whole ECB with Tutor ROMs: $495. Order MEX68KECB from any 
Motorola distributor. 

Note: If you are affiliated with a school or university, information about 
donations and grants is available from: 


Motorola Technical Operations 
University Support Program, HW-68 
P.O. Box 2953 
Phoenix, AZ 85062 
602/244-6777 


Tiny BASIC 


Gordon Brandly 


Dr. Dobb's Journal was the first magazine 
to publish the source code for a Tiny BASIC interpreter. 
Here we present Tiny BASIC for the 68000, translated from 
the original 8080 version (and significantly improved). 


Remember the good old days? When the 8080 microprocessor reigned 
supreme, 8K of memory cost an arm and a leg, ah yes. . . . The years 
went by, microcomputers got bigger, software grew more sophisticated and 
prices went up. That is just fine, of course, if you can afford the prices. The 
less fortunate among us, however, must build or buy smaller 16-bit 
“educational” systems. That is fine too—if you don't mind having hardly any 
software. 

This is just the sort of situation that gave rise to Dr. Dobb's Journal in the 
"good old days." The solution back then was to publish a Tiny BASIC 
interpreter that could be adapted to just about any 8080 microcomputer 
around. This solution worked fabulously and gave many a hobby computer its 
first taste of useful software. Well, if the solution worked once, why not 
again? I therefore decided to produce a Tiny BASIC interpreter for the 
relatively small 68000 systems, such as the Motorola Educational Computer 
Board (ECB), the EMS 68000 board and so on. 

To produce this BASIC, I took one of the most successful 8080 Tiny 
BASICs—Li Chen Wang's Palo Alto Tiny BASIC (Dr. Dobb’s Journal, May 
1976)—and translated it into 68000 code. I then added a few features and 
optimized the code a little, producing a surprisingly usable interpreter. 

Those who know the original Palo Alto Tiny BASIC (or the Sherry 
Brothers' version on CP/M User's Group Volume 11) will find this interpreter 
very similar. I have made two or three changes to the interpreter's syntax to 
bring it closer to the de facto Microsoft standard. The colon is used instead of 
the semicolon to separate multiple statements on a line. The inequality 
operator (#) has been changed to the more standard <>. I also added the 
PEEK, POKE, CALL, BYE, LOAD and SAVE commands, which are 
described later. 

Those of you used to a bigger BASIC, such as the various Microsoft 
interpreters, will find that this version works almost the same within its 
limitations. Following are some excerpts from Li Chen Wang's original 
documentation, mixed with descriptions of my extensions. 


The Language 

• Numbers. In this version of Tiny BASIC, all numbers are 32-bit 
integers and must be in the range -2,147,483,648 to 2,147,483,647. I decided 
to use 32 bits so that the PEEK and POKE commands could access the entire 
address range of the 68000. This slows down arithmetic operations somewhat, 
but sticking to 16 bits would have produced unnecessary complications. 

• Variables. There are 26 variables, denoted by the letters A through Z, 
and a single array @(I). The dimension of this array (i.e., the range of values 
of the index I) is set automatically to make use of all the memory space that 
is left unused by the program (i.e., 0 through SIZE/4; see the SIZE function 
below). All variables and array elements are 4 bytes long. 

• Functions. There are four functions: 

ABS(X) gives the absolute value of X. 

RND(X) gives a random number between 1 and X (inclusive). 

SIZE gives the number of bytes left unused by the program. 

PEEK(X) gives the value of the byte at memory location X. 

• Commands. The LET command LET A=234-5*6,A=A/2,X=A-100,@(X+9)=A-1 
will set the variable A to the value of the expression 234-5*6 (or 204), 
set the variable A (again) to the value of the expression A/2 (or 
102), set the variable X to the value of the expression A-100 (or 2), and then 
set the variable @(11) to 101 (where 11 is the value of the expression X+9 
and 101 is the value of the expression A-1). 
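The arithmetic in that LET example can be checked with a short Python sketch. The names `variables` and `array` are illustrative only and do not come from the interpreter, and `//` stands in for Tiny BASIC's integer division (an assumption that matters only for negative operands, which do not occur here).

```python
# Hypothetical model of the LET example; not the interpreter's own data layout.
variables = {}   # the 26 single-letter variables
array = {}       # the @(I) array

variables["A"] = 234 - 5 * 6                     # 204
variables["A"] = variables["A"] // 2             # 102 (integer division)
variables["X"] = variables["A"] - 100            # 2
array[variables["X"] + 9] = variables["A"] - 1   # @(11) = 101

print(variables["A"], variables["X"], array[11])
```

Each assignment is carried out left to right, so the second assignment to A sees the value stored by the first.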

The PRINT command PRINT A*3+1, "abc 123 !@#", 'cba' will print the 
value of the expression A*3+1 (or 307), the string of characters abc 123 !@# 
and the string cba, and then a CR-LF (carriage return and line feed). Note that 
you can use either single or double quotes to quote strings, but pairs must 
match. If a comma appears at the end of the PRINT command, the final CR-LF 
will not be printed. Note also that commas separate adjacent items (most 
other BASICs use the semicolon to perform this function). 

You can use the '$' character to send control codes to your terminal. This 
is usually used to move the terminal's cursor around. For instance: 


PRINT $27, '=',$32+Y,$32+X 


will move the cursor to the position indicated by the x and y variables on 
most Lear-Siegler and TeleVideo terminals. The number following the $ sign 
is the control code's decimal value. For example, $27 sends an ASCII 
ESCape character. 
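As a rough illustration, the bytes that this PRINT statement sends can be modeled in Python. The function name `cursor_sequence` is hypothetical; the masking with `0xFF` reflects the listing's note that only the low byte of each `$` expression is printed.

```python
ESC = 27  # ASCII ESCape, the $27 in the PRINT example

def cursor_sequence(x, y):
    """Return the bytes sent by PRINT $27,'=',$32+Y,$32+X (row byte first)."""
    return bytes([ESC, ord("="), (32 + y) & 0xFF, (32 + x) & 0xFF])

# Column 10, row 5 becomes ESC '=' followed by the bytes 37 and 42.
print(list(cursor_sequence(10, 5)))
```

The offset of 32 keeps the coordinate bytes in the printable ASCII range, which is what these terminals expect.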

The command PRINT A, B, #3, C, D, E, #10, F, G will print the values 
of A and B in 11 spaces; the values of C, D, and E in 3 spaces; and the values 
of F and G in 10 spaces. The value will be printed in full even if there aren't 
enough spaces specified for it. 
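The field-width rule can be sketched in Python as follows. This is a model under stated assumptions rather than a transcription of the interpreter: the function name `format_print` and the `("#", n)` tuple notation are invented for the example, and the numbers are assumed to be right-justified within their fields.

```python
def format_print(items, default_width=11):
    """Model PRINT field widths.

    items holds integers to print, or ("#", n) tuples that change the
    field width for the values that follow.
    """
    width = default_width
    out = []
    for item in items:
        if isinstance(item, tuple) and item[0] == "#":
            width = item[1]                    # new width stays in effect
        else:
            out.append(str(item).rjust(width)) # never truncates a long value
    return "".join(out)

# PRINT A, B, #3, C, D, E, #10, F, G with the values 1 through 7
print(repr(format_print([1, 2, ("#", 3), 3, 4, 5, ("#", 10), 6, 7])))
```

Because `rjust` only pads, a value wider than its field is still printed in full, matching the behavior described above.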
The command PRINT 'abc',_,'XXX' will print the string abc, a CR 
without an LF, and then the string XXX (over the abc) followed by a CR-LF. 

The INPUT command INPUT A,B will cause Tiny BASIC to print A: and 
wait to read in an expression from the console. The variable A will be set to 
the value of this expression. Then B: will be printed and variable B set to the 
value of the next expression entered. Note that you can enter complete 
expressions as well as numbers. This enables an interesting trick: you can set 
the variable Y to an unusual value (e.g., 9999) and use it to get the answer to 
a yes-or-no question, such as: 


10 Y=9999 :INPUT 'Are you sleepy?'A :IF A=Y GOTO 100 


The user can answer the question with the expression Y, which puts the 
numeric value of Y into the A variable. 
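The trick works because the reply is evaluated rather than merely parsed as digits. A deliberately reduced Python sketch shows the idea; it accepts only an integer literal or a single variable name, whereas the real interpreter evaluates complete expressions. The name `read_reply` is hypothetical.

```python
def read_reply(reply, variables):
    """Evaluate a reply that is an integer literal or a variable name."""
    reply = reply.strip().upper()   # input is folded to upper case
    if reply in variables:
        return variables[reply]     # typing Y yields the value stored in Y
    return int(reply)

variables = {"Y": 9999}
a = read_reply("y", variables)
print(a == variables["Y"])          # the IF A=Y test succeeds
```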

INPUT 'What is the weight'A, "and size"B is the same as the first INPUT 
example except that the prompt A: is replaced by "What is the weight:" and 
the prompt B: is replaced by "and size:". Again, you can use either single or 
double quotes as long as they match. 

In INPUT A, 'string',_, "another string", B the strings and the _ have the 
same effect as in PRINT. 

The POKE command POKE 4000+X,Y puts the value produced by 
expression Y into the byte memory location specified by the expression 
4000+X. 

The CALL command CALL X will call a machine language subroutine at 
the address specified by the expression X. All of the CPU's registers except 
the stack pointer can be used in the subroutine. 

The BYE command will return control to the resident monitor program or 
operating system. 

The SAVE command will save your BASIC program on the storage device 
you provide. Details on installing this device are given in the source code. As 
set up for the Educational Computer Board, this command will send the 
program out to the host computer in an easily stored text form. This isn't, 
however, human-readable program text because the line numbers are 
represented in hexadecimal. 

The LOAD command will delete the program in memory and load in a 
program from your storage device. 


Stopping the Program 

You can stop the execution of the program or listing of the program by 
pressing Ctrl-C on the console. Additionally, you can pause in a program 
listing by pressing Ctrl-S and then pressing any key to continue. 


Abbreviations and Blanks 


You may use blanks freely within a program except that numbers, 
command keywords, and function names cannot have embedded blanks. 
Abbreviation    Command 

A.              ABS 
C.              CALL 
F.              FOR 
GOS.            GOSUB 
G.              GOTO 
IF              IF 
IN.             INPUT 
L.              LIST 
LO.             LOAD 
N.              NEW 
N.              NEXT 
P.              PEEK 
PO.             POKE 
P.              PRINT 
REM             REMARK 
R.              RETURN 
R.              RND 
R.              RUN 
S.              SAVE 
S.              SIZE 
S.              STEP 
S.              STOP 
TO              TO 
no keyword      LET 


TABLE 6.1 Command abbreviations. 


You may abbreviate all command keywords and function names, following 
each by a period. For instance, P., PR., PRI. and PRIN. all stand for PRINT. 
You may also omit the word LET in LET commands. The shortest 
abbreviations for all keywords are given in Table 6.1. 

Note that, in some cases, the same abbreviation applies to different 
keywords. The interpreter is "smart" enough to identify the correct keyword 
for a particular situation. For instance, if the abbreviation P. appears at the 
beginning of a line, it can only mean PRINT. In a statement such as 
A=P.(8), the P. only makes sense if it stands for PEEK. 
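The matching rule can be modeled in a few lines of Python. This is a sketch of the behavior described here, not a translation of the assembly code: the function name `match_keyword` is hypothetical, and the keyword order is taken from the statement and function tables in Listing 6.1. The first table entry that matches wins, and a period is accepted only after at least one character has matched.

```python
STATEMENTS = ["NEXT", "LET", "IF", "GOTO", "GOSUB", "RETURN", "REM",
              "FOR", "INPUT", "PRINT", "POKE", "STOP", "BYE", "CALL"]
FUNCTIONS = ["PEEK", "RND", "ABS", "SIZE"]

def match_keyword(text, table):
    """Return (keyword, remaining_text), or (None, text) if nothing matches."""
    for word in table:
        for i, ch in enumerate(text):
            if ch == "." and i > 0:
                return word, text[i + 1:]   # abbreviation accepted
            if ch != word[i]:
                break                       # mismatch: try the next entry
            if i == len(word) - 1:
                return word, text[i + 1:]   # full keyword matched
    return None, text

print(match_keyword("P. A", STATEMENTS))    # statement context
print(match_keyword("P.(8)", FUNCTIONS))    # function context
```

The same abbreviation resolves differently only because a different table is searched in each context, which is how P. can mean PRINT at the start of a statement and PEEK inside an expression.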


Error Messages 

There are only three error conditions in Tiny BASIC. The line containing 
the error is printed out with a question mark inserted at the point at which the 
error is detected. 

• "What?" indicates an error in a statement's syntax: 


What? 
260 LET A=B+3, C=(3+4?.X=4 
• "How?" means that the statement in question is syntactically correct, but 
for some reason the command can't be carried out: 


How? 
310 LET A=B*C?+2 


How? 
380 GOTO 412? 


In the first example, B*C is greater than 2,147,483,647. In the second 
example, there is no program line 412. 
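The overflow case can be illustrated with a small Python sketch. Python integers do not overflow, so the 32-bit range is checked explicitly; the names `HowError` and `checked_multiply` are hypothetical and only stand in for the interpreter's behavior.

```python
INT32_MIN = -2_147_483_648
INT32_MAX = 2_147_483_647

class HowError(Exception):
    """Stands in for Tiny BASIC's "How?" message (hypothetical name)."""

def checked_multiply(b, c):
    """Multiply two values, rejecting results outside the 32-bit range."""
    result = b * c
    if not INT32_MIN <= result <= INT32_MAX:
        raise HowError("How?")
    return result

print(checked_multiply(46340, 46340))   # still fits in 32 bits
try:
    checked_multiply(46341, 46341)      # one step further does not
except HowError as err:
    print(err)
```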

• "Sorry." means that the interpreter understands the statement and knows 
how to do it but lacks sufficient memory to accomplish the task. 

If you notice an error in your entry before you press RETURN, you can 
delete characters with the backspace (Ctrl-H) key or delete the entire line with 
Ctrl-X. To delete an existing program line, just type the line number and 
press RETURN. 


Installation 

Now, how do you get this program running on your computer? Very 
easily, if you have a setup similar to mine. Installation on other systems 
should also be fairly easy if you have access to a 68000 assembler of some 
kind. 

My setup is a Motorola MEX68KECB Educational Computer Board 
(ECB) connected between my terminal and my CP/M system. The source 
code was assembled with the Quelo version 1.9 public domain 68000 cross- 
assembler for CP/M. (By the way, if you use this assembler, you will get 36 
"trim 16 address" errors, which is normal.) Tiny BASIC is then loaded into 
the ECB and executed at the cold start address of hex 900. 

BASIC programs are saved and loaded by setting up an appropriate CP/M 
command before using SAVE or LOAD. For example (user input follows 
each prompt): 


After a program is written, exit to the monitor: 
> BYE 

Enter transparent mode: 
TUTOR 1.x>TM 

Issue a PIP command to the CP/M host: 
A>PIP PROGRAM.BAS=CON: 

Exit transparent mode and do a BASIC warm start: 
TUTOR 1.x>GO 904 

Do the actual save: 
SAVE 
The warm start is an entry point into the interpreter that will preserve any 
program you may have already entered. 

Program LOADs are done similarly, except instead of PIP you must run a 
small program that will wait to receive a carriage return before sending the 
program to the ECB. Here is a sample program in Microsoft BASIC: 


10 INPUT "Program to send?";F$ 
20 OPEN "I",1,F$ 
30 INPUT "Now exit Transparent Mode and do a LOAD,";Z$ 
40 WHILE NOT EOF(1):LINE INPUT #1,A$:PRINT A$:WEND 


Admittedly, this way of loading and saving is a fairly complex procedure, 
but it allows you to save your programs on disk while keeping the interpreter 
itself small. If your ECB isn't connected to another computer, you probably 
could change the AUXIN and AUXOUT subroutines to use the cassette 
interface. (I haven't tried it, though, so caveat emptor!) 

For other 68000 systems, you will have to modify only the OUTC, INC, 
AUXOUT, AUXIN, and BYEBYE routines at the end of the interpreter 
program. In addition, you must put the address of the first unavailable 
memory location above BASIC into the location ENDMEM. BASIC 
programs are saved in a form that can be stored as ASCII text and read back 
quickly by the 68000; if your storage device can't handle the present format or 
if you would like the program saved in a human-readable form, you need 
modify only the SAVE and LOAD subroutines. 

One warning: I wrote the DIRECT and EXEC routines assuming that the 
interpreter itself would be somewhere in the first 64K of memory ($0 to 
$FFFF). If you move it above 64K, you will have to modify the EXEC 
routine and check the rest of the code carefully to make sure the addressing 
modes are correct. 


Evaluation 

I am quite pleased with how the interpreter turned out. Even though I added 
extra error checking, lowercase conversion, and more commands and extended 
the variable size to 32 bits, the whole thing still fits inside 3K of memory. I 
ran the Sieve of Eratosthenes benchmark program on this interpreter and on 
the Sherry Brothers' CP/M Tiny BASIC with the following results: 


68000 at 4 MHz Z80 at 4 MHz 
2670 seconds 3000 seconds 


Although I adjusted the results for the usual ten iterations of the basic 
algorithm, I actually ran the program only for one iteration to keep running 
times within a practical limit. This Tiny BASIC may not be a speed demon, 
but it does beat Applesoft and PET BASIC at running the Sieve benchmark. I 
should add that I compressed the Sieve program listing to the maximum for 
speed considerations; I normally use more spaces and some comments so that 
I can figure out later what the program was supposed to do! 


Possible Improvements 

Of course, many improvements can be made given more available 
memory. My Educational Computer Board has 32K of memory, so I probably 
will add such things as more variables, strings, and keyword tokenization. 
The last is a method used by most BASIC interpreters to compress keywords 
such as LET and PRINT into single bytes. This would greatly speed up the 
interpreter while using less memory to store the BASIC program. 
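To make the idea concrete, here is a minimal Python sketch of keyword tokenization. It is not part of this interpreter: the keyword list, the token values starting at $80, and the function name `tokenize` are all assumptions chosen for illustration.

```python
# Hypothetical token assignments: one byte with the high bit set per keyword.
KEYWORDS = ["LET", "PRINT", "GOTO", "GOSUB", "RETURN",
            "IF", "FOR", "NEXT", "INPUT"]
TOKENS = {word: 0x80 + i for i, word in enumerate(KEYWORDS)}

def tokenize(line):
    """Replace keywords outside quoted strings with single-byte tokens."""
    out = bytearray()
    i = 0
    quote = None                      # quote character of an open string
    while i < len(line):
        ch = line[i]
        if quote:                     # inside a string: copy verbatim
            out.append(ord(ch))
            if ch == quote:
                quote = None
            i += 1
            continue
        if ch in "'\"":               # a string starts here
            quote = ch
            out.append(ord(ch))
            i += 1
            continue
        for word, token in TOKENS.items():
            if line.startswith(word, i):
                out.append(token)     # keyword collapses to one byte
                i += len(word)
                break
        else:
            out.append(ord(ch))       # ordinary character
            i += 1
    return bytes(out)

print(len("PRINT A"), len(tokenize("PRINT A")))
```

The stored line shrinks, and at run time the interpreter could dispatch on the token byte directly instead of comparing strings against a table.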

John Byrns pointed out in a letter in the July 1985 issue of Dr. Dobb's 
Journal of Software Tools that the RND function doesn't work very well. The 
original random-number generator is, quite frankly, atrocious (as you can see 
in my comments in the source code). [See Chapter 14, "A Pseudo Random- 
Number Generator" by Michael P. McLaughlin, and Chapter 15, "Generating 
Nonuniform Distributions of Random Numbers" by Chris Crawford, for 
possible solutions to the problem.] 

The letters column in the August 1985 issue of DDJ contains an excellent 
analysis by Robert Grappel of some of the interpreter's workings. He 
suggests many good ways to speed up the interpreter's operation without 
significantly increasing its size. 
Listing 6.1 


*******************************************************************
*                                                                 *
*               Tiny BASIC for the Motorola MC68000               *
*                                                                 *
* Derived from Palo Alto Tiny BASIC as published in the May 1976  *
* issue of Dr. Dobb's Journal.  Adapted to the 68000 by:          *
*       Gordon Brandly                                            *
*       Apt. C 8710-97th Ave                                      *
*       Edmonton, Alberta                                         *
*       Canada T6C 2C1                                            *
*                                                                 *
*                                                                 *
* This version is for MEX68KECB Educational Computer Board I/O.   *
*                                                                 *
*******************************************************************
* Copyright (C) 1984 by Gordon Brandly. This program may be       *
* freely distributed for personal use only. All commercial        *
* rights are reserved.                                            *
*******************************************************************

* Vers. 1.0  1984/7/17  - Original version by Gordon Brandly
*       1.1  1984/12/9  - Addition of '$' print term by Marvin Lipford
*       1.2  1985/4/9   - Bug fix in multiply routine by Rick Murray
        OPT     FRS,BRS         forward ref.'s & branches default to short

CR      EQU     $0D             ASCII equates
LF      EQU     $0A
TAB     EQU     $09
CTRLC   EQU     $03
CTRLH   EQU     $08
CTRLS   EQU     $13
CTRLX   EQU     $18

BUFLEN  EQU     80              length of keyboard input buffer
        ORG     $900            first free address using Tutor
*
* Standard jump table. You can change these addresses if you are
* customizing this interpreter for a different environment.
*
START   BRA.L   CSTART          Cold Start entry point
GOWARM  BRA.L   WSTART          Warm Start entry point
GOOUT   BRA.L   OUTC            Jump to character-out routine
GOIN    BRA.L   INC             Jump to character-in routine
GOAUXO  BRA.L   AUXOUT          Jump to auxiliary-out routine
GOAUXI  BRA.L   AUXIN           Jump to auxiliary-in routine
GOBYE   BRA.L   BYEBYE          Jump to monitor, DOS, etc.
*
* Modifiable system constants:
*
TXTBGN  DC.L    TXT             beginning of program memory
ENDMEM  DC.L    $8000           end of available memory
*
* The main interpreter starts here:
*
CSTART  MOVE.L  ENDMEM,SP       initialize stack pointer
        LEA     INITMSG,A6      tell who we are
        BSR.L   PRMESG
        MOVE.L  TXTBGN,TXTUNF   init. end-of-program pointer
        MOVE.L  ENDMEM,D0       get address of end of memory
        SUB.L   #2048,D0        reserve 2K for the stack
        MOVE.L  D0,STKLMT
        SUB.L   #108,D0         reserve variable area (27 long words)
        MOVE.L  D0,VARBGN
WSTART  CLR.L   D0              initialize internal variables
        MOVE.L  D0,LOPVAR
        MOVE.L  D0,STKGOS
        MOVE.L  D0,CURRNT       current line number pointer = 0
        MOVE.L  ENDMEM,SP       init S.P. again, just in case
        LEA     OKMSG,A6        display "OK"
        BSR.L   PRMESG
ST3     MOVE.B  #'>',D0         Prompt with a '>' and
        BSR.L   GETLN           read a line.
        BSR.L   TOUPBUF         convert to upper case
        MOVE.L  A0,A4           save pointer to end of line
        LEA     BUFFER,A0       point to the beginning of line
        BSR.L   TSTNUM          is there a number there?
        BSR.L   IGNBLK          skip trailing blanks
        TST     D1              does line no. exist? (or nonzero?)
        BEQ.L   DIRECT          if not, it's a direct statement
        CMP.L   #$FFFF,D1       see if line no. is <= 16 bits
        BCC.L   QHOW            if not, we've overflowed
        MOVE.B  D1,-(A0)        store the binary line no.
        ROR     #8,D1           (Kludge to store a word on a
        MOVE.B  D1,-(A0)        possible byte boundary)
        ROL     #8,D1
        BSR.L   FNDLN           find this line in save area
        MOVE.L  A1,A5           save possible line pointer
        BNE     ST4             if not found, insert
        BSR.L   FNDNXT          find the next line (into A1)
        MOVE.L  A5,A2           pointer to line to be deleted
        MOVE.L  TXTUNF,A3       points to top of save area
        BSR.L   MVUP            move up to delete
        MOVE.L  A2,TXTUNF       update the end pointer
ST4     MOVE.L  A4,D0           calculate the length of new line
        SUB.L   A0,D0
        CMP.L   #3,D0           is it just a line no. & CR?
        BEQ     ST3             if so, it was just a delete
        MOVE.L  TXTUNF,A3       compute new end
        MOVE.L  A3,A6
        ADD.L   D0,A3
        MOVE.L  VARBGN,D0       see if there's enough room
        CMP.L   A3,D0
        BLS.L   QSORRY          if not, say so
        MOVE.L  A3,TXTUNF       if so, store new end position
        MOVE.L  A6,A1           points to old unfilled area
        MOVE.L  A5,A2           points to beginning of move area
        BSR.L   MVDOWN          move things out of the way
        MOVE.L  A0,A1           set up to do the insertion
        MOVE.L  A5,A2
        MOVE.L  A4,A3
        BSR.L   MVUP            do it
        BRA     ST3             go back and get another line
*
* *** Tables *** DIRECT *** EXEC ***
*
* This section of the code tests a string against a table. When
* a match is found, control is transferred to the section of
* code according to the table.
*
* At 'EXEC', A0 should point to the string, A1 should point to
* the character table, and A2 should point to the execution
* table. At 'DIRECT', A0 should point to the string, A1 and
* A2 will be set up to point to TAB1 and TAB1.1, which are
* the tables of all direct and statement commands.
*
* A '.' in the string will terminate the test and the partial
* match will be considered as a match, e.g. 'P.', 'PR.', 'PRI.',
* 'PRIN.', or 'PRINT' will all match 'PRINT'.
*
* There are two tables: the character table and the execution
* table. The character table consists of any number of text items.
* Each item is a string of characters with the last character's
* high bit set to one. The execution table holds 16-bit
* execution addresses that correspond to each entry in the
* character table.
*
* The end of the character table is a 0 byte which corresponds
* to the default routine in the execution table, which is
* executed if none of the other table items are matched.
*
* Character-matching tables:
TAB1    DC.B    'LIS',('T'+$80)         Direct commands
        DC.B    'LOA',('D'+$80)
        DC.B    'NE',('W'+$80)
        DC.B    'RU',('N'+$80)
        DC.B    'SAV',('E'+$80)
TAB2    DC.B    'NEX',('T'+$80)         Direct / statement
        DC.B    'LE',('T'+$80)
        DC.B    'I',('F'+$80)
        DC.B    'GOT',('O'+$80)
        DC.B    'GOSU',('B'+$80)
        DC.B    'RETUR',('N'+$80)
        DC.B    'RE',('M'+$80)
        DC.B    'FO',('R'+$80)
        DC.B    'INPU',('T'+$80)
        DC.B    'PRIN',('T'+$80)
        DC.B    'POK',('E'+$80)
        DC.B    'STO',('P'+$80)
        DC.B    'BY',('E'+$80)
        DC.B    'CAL',('L'+$80)
        DC.B    0
TAB4    DC.B    'PEE',('K'+$80)         Functions
        DC.B    'RN',('D'+$80)
        DC.B    'AB',('S'+$80)
        DC.B    'SIZ',('E'+$80)
        DC.B    0
TAB5    DC.B    'T',('O'+$80)           "TO" in "FOR"
        DC.B    0
TAB6    DC.B    'STE',('P'+$80)         "STEP" in "FOR"
        DC.B    0
TAB8    DC.B    '>',('='+$80)           Relational operators
        DC.B    '<',('>'+$80)
        DC.B    ('>'+$80)
        DC.B    ('='+$80)
        DC.B    '<',('='+$80)
        DC.B    ('<'+$80)
        DC.B    0
        DC.B    0       <- for aligning on a word boundary


* Execution address tables:
TAB1.1  DC.W    LIST                    Direct commands
        DC.W    LOAD
        DC.W    NEW
        DC.W    RUN
        DC.W    SAVE
TAB2.1  DC.W    NEXT                    Direct / statement
        DC.W    LET
        DC.W    IF
        DC.W    GOTO
        DC.W    GOSUB
        DC.W    RETURN
        DC.W    REM
        DC.W    FOR
        DC.W    INPUT
        DC.W    PRINT
        DC.W    POKE
        DC.W    STOP
        DC.W    GOBYE
        DC.W    CALL
        DC.W    DEFLT
TAB4.1  DC.W    PEEK                    Functions
        DC.W    RND
        DC.W    ABS
        DC.W    SIZE
        DC.W    XP40
TAB5.1  DC.W    FR1                     "TO" in "FOR"
        DC.W    QWHAT
TAB6.1  DC.W    FR2                     "STEP" in "FOR"
        DC.W    FR3
TAB8.1  DC.W    XP11    >=              Relational operators
        DC.W    XP12    <>
        DC.W    XP13    >
        DC.W    XP15    =
        DC.W    XP14    <=
        DC.W    XP16    <
        DC.W    XP17
* 
DIRECT  LEA     TAB1,A1
        LEA     TAB1.1,A2
EXEC    BSR.L   IGNBLK          ignore leading blanks
        MOVE.L  A0,A3           save the pointer
        CLR.B   D2              clear match flag
EXLP    MOVE.B  (A0)+,D0        get the program character
        MOVE.B  (A1),D1         get the table character
        BNE     EXNGO           If end of table,
        MOVE.L  A3,A0           restore the text pointer and...
        BRA     EXGO            execute the default.
EXNGO   MOVE.B  D0,D3           Else check for period...
        AND.B   D2,D3           and a match.
        CMP.B   #'.',D3
        BEQ     EXGO            if so, execute
        AND.B   #$7F,D1         ignore the table's high bit
        CMP.B   D0,D1           is there a match?
        BEQ     EXMAT
        ADDQ.L  #2,A2           if not, try the next entry
        MOVE.L  A3,A0           reset the program pointer
        CLR.B   D2              sorry, no match
EX1     TST.B   (A1)+           get to the end of the entry
        BPL     EX1
        BRA     EXLP            back for more matching
EXMAT   MOVEQ   #-1,D2          we've got a match so far
        TST.B   (A1)+           end of table entry?
        BPL     EXLP            if not, go back for more
EXGO    LEA     0,A3            execute the appropriate routine
        MOVE    (A2),A3
        JMP     (A3)


* 
* What follows is the code to execute direct and statement
* commands. Control is transferred to these points via the command
* table lookup code of 'DIRECT' and 'EXEC' in the last section.
* After the command is executed, control is transferred to other
* sections as follows:
*
* For 'LIST', 'NEW', and 'STOP': go back to the warm start point.
* For 'RUN': go execute the first stored line if any; else go
* back to the warm start point.
* For 'GOTO' and 'GOSUB': go execute the target line.
* For 'RETURN' and 'NEXT': go back to saved return line.
* For all others: if 'CURRNT' is 0, go to warm start; else go
* execute next command. (This is done in 'FINISH'.)
*
*******************************************************************
*
* *** NEW *** STOP *** RUN (& friends) *** GOTO ***
*
* 'NEW<CR>' sets TXTUNF to point to TXTBGN
*
* 'STOP<CR>' goes back to WSTART
*
* 'RUN<CR>' finds the first stored line, stores its address
* in CURRNT, and starts executing it. Note that only those
* commands in TAB2 are legal for a stored program.
*
* There are 3 more entries in 'RUN':
* 'RUNNXL' finds next line, stores its address and executes it.
* 'RUNTSL' stores the address of this line and executes it.
* 'RUNSML' continues the execution on same line.
*
* 'GOTO expr<CR>' evaluates the expression, finds the target
* line, and jumps to 'RUNTSL' to do it.
*

NEW     BSR.L   ENDCHK
        MOVE.L  TXTBGN,TXTUNF   set the end pointer

STOP    BSR.L   ENDCHK
        BRA     WSTART

RUN     BSR.L   ENDCHK
        MOVE.L  TXTBGN,A0       set pointer to beginning
        MOVE.L  A0,CURRNT

RUNNXL  TST.L   CURRNT          executing a program?
        BEQ.L   WSTART          if not, we've finished a direct stat.
        CLR.L   D1              else find the next line number
        MOVE.L  A0,A1
        BSR.L   FNDLNP
        BCS     WSTART          if we've fallen off the end, stop

RUNTSL  MOVE.L  A1,CURRNT       set CURRNT to point to the line no.
        MOVE.L  A1,A0           set the text pointer to
        ADDQ.L  #2,A0           the start of the line text

RUNSML  BSR.L   CHKIO           see if a control-C was pressed
        LEA     TAB2,A1         find command in TAB2
        LEA     TAB2.1,A2
        BRA     EXEC            and execute it

GOTO    BSR.L   EXPR            evaluate the following expression
        BSR.L   ENDCHK          must find end of line
        MOVE.L  D0,D1
        BSR.L   FNDLN           find the target line
        BNE.L   QHOW            no such line no.
        BRA     RUNTSL          go do it


* 


*******************************************************************
*
* *** LIST *** PRINT ***
*
* LIST has two forms:
* 'LIST<CR>' lists all saved lines
* 'LIST #<CR>' starts listing at the line #
* Control-S pauses the listing, control-C stops it.
*
* PRINT command is 'PRINT ....:' or 'PRINT ....<CR>'
* where '....' is a list of expressions, formats, back-arrows,
* and strings. These items are separated by commas.
*
* A format is a pound sign followed by a number. It controls
* the number of spaces the value of an expression is going to
* be printed in. It stays effective for the rest of the print
* command unless changed by another format. If no format is
* specified, 11 positions will be used.
*
* A string is quoted in a pair of single- or double-quotes.
*
* An underline (back-arrow) means generate a <CR> without a <LF>
*
* A <CR LF> is generated after the entire list has been printed
* or if the list is empty. If the list ends with a comma,
* however, no <CR LF> is generated.
*

LIST    BSR.L   TSTNUM          see if there's a line no.
        BSR.L   ENDCHK          if not, we get a zero
        BSR.L   FNDLN           find this or next line
LS1     BCS     WSTART          warm start if we passed the end
        BSR.L   PRTLN           print the line
        BSR.L   CHKIO           check for listing halt request
        BEQ     LS3
        CMP.B   #CTRLS,D0       pause the listing?
        BNE     LS3
LS2     BSR.L   CHKIO           if so, wait for another keypress
        BEQ     LS2
LS3     BSR.L   FNDLNP          find the next line
        BRA     LS1

PRINT   MOVE    #11,D4          D4 = number of print spaces
        BSR.L   TSTC            if null list and ":"
        DC.B    ':',PR2-*
        BSR.L   CRLF            give CR-LF and continue
        BRA     RUNSML          execution on the same line
PR2     BSR.L   TSTC            if null list and <CR>
        DC.B    CR,PR0-*
        BSR.L   CRLF            also give CR-LF and
        BRA     RUNNXL          execute the next line
PR0     BSR.L   TSTC            else is it a format?
        DC.B    '#',PR1-*
        BSR.L   EXPR            yes, evaluate expression
        MOVE    D0,D4           and save it as print width
        BRA     PR3             look for more to print
PR1     BSR.L   TSTC            is character expression? (MRL)
        DC.B    '$',PR4-*
        BSR.L   EXPR            yep. Evaluate expression (MRL)
        BSR     GOOUT           print low byte (MRL)
        BRA     PR3             look for more. (MRL)
PR4     BSR.L   QTSTG           is it a string?
        BRA.S   PR8             if not, must be an expression
PR3     BSR.L   TSTC            if ",", go find next
        DC.B    ',',PR6-*
        BSR.L   FIN             in the list.
        BRA     PR0
PR6     BSR.L   CRLF            list ends here
        BRA     FINISH
PR8     MOVE    D4,-(SP)        save the width value
        BSR.L   EXPR            evaluate the expression
        MOVE    (SP)+,D4        restore the width
        MOVE.L  D0,D1
        BSR.L   PRTNUM          print its value
        BRA     PR3             more to print?

FINISH  BSR.L   FIN             Check end of command
        BRA.L   QWHAT           print "What?" if wrong
*
*
* *** GOSUB *** & RETURN ***
*
* 'GOSUB expr:' or 'GOSUB expr<CR>' is like the 'GOTO' command,
* except that the current text pointer, stack pointer, etc. are
* saved so that execution can be continued after the subroutine
* 'RETURN's. In order that 'GOSUB' can be nested (and even
* recursive), the save area must be stacked. The stack pointer
* is saved in 'STKGOS'. The old 'STKGOS' is saved on the stack.
* If we are in the main routine, 'STKGOS' is zero (this was done
* in the initialization section of the interpreter), but we still
* save it as a flag for no further 'RETURN's.
*
* 'RETURN<CR>' undoes everything that 'GOSUB' did, and thus
* returns the execution to the command after the most recent
* 'GOSUB'. If 'STKGOS' is zero, it indicates that we never had
* a 'GOSUB' and is thus an error.
*
GOSUB   BSR.L   PUSHA           save the current 'FOR' parameters
        BSR.L   EXPR            get line number
        MOVE.L  A0,-(SP)        save text pointer
        MOVE.L  D0,D1
        BSR.L   FNDLN           find the target line
        BNE.L   AHOW            if not there, say "How?"
        MOVE.L  CURRNT,-(SP)    found it, save old 'CURRNT'...
        MOVE.L  STKGOS,-(SP)    and 'STKGOS'
        CLR.L   LOPVAR          load new values
        MOVE.L  SP,STKGOS
        BRA     RUNTSL

RETURN  BSR.L   ENDCHK          there should be just a <CR>
        MOVE.L  STKGOS,D1       get old stack pointer
        BEQ.L   QWHAT           if zero, it doesn't exist
        MOVE.L  D1,SP           else restore it
        MOVE.L  (SP)+,STKGOS    and the old 'STKGOS'
        MOVE.L  (SP)+,CURRNT    and the old 'CURRNT'
        MOVE.L  (SP)+,A0        and the old text pointer
        BSR.L   POPA            and the old 'FOR' parameters
        BRA     FINISH          and we are back home
*
* *** FOR *** & NEXT ***
*
* 'FOR' has two forms:
* 'FOR var=exp1 TO exp2 STEP exp3' and 'FOR var=exp1 TO exp2'
* The second form means the same thing as the first form with a
* STEP of positive 1. The interpreter will find the variable 'var'
* and set its value to the current value of 'exp1'. It also
* evaluates 'exp2' and 'exp3' and saves all these together with
* the text pointer, etc. in the 'FOR' save area, which consists of
* 'LOPVAR', 'LOPINC', 'LOPLMT', 'LOPLN', and 'LOPPT'. If there is
* already something in the save area (indicated by a non-zero
* 'LOPVAR'), then the old save area is saved on the stack before
* the new values are stored. The interpreter will then dig in the
* stack and find out if this same variable was used in another
* currently active 'FOR' loop. If that is the case, then the old
* 'FOR' loop is deactivated (i.e., purged from the stack).
*
* 'NEXT var' serves as the logical (not necessarily physical) end
* of the 'FOR' loop. The control variable 'var' is checked with
* the 'LOPVAR'. If they are not the same, the interpreter digs in
* the stack to find the right one and purges all those that didn't
* match. Either way, it then adds the 'STEP' to that variable and
* checks the result against the limit value. If it is within
* the limit, control loops back to the command following the
* 'FOR'. If it's outside the limit, the save area is purged and
* execution continues.
*
FOR     BSR.L   PUSHA           save the old 'FOR' save area
        BSR.L   SETVAL          set the control variable
        MOVE.L  A6,LOPVAR       save its address
        LEA     TAB5,A1         use 'EXEC' to test for 'TO'
        LEA     TAB5.1,A2
        BRA     EXEC
FR1     BSR.L   EXPR            evaluate the limit
        MOVE.L  D0,LOPLMT       save that
        LEA     TAB6,A1         use 'EXEC' to look for the
        LEA     TAB6.1,A2       word 'STEP'
        BRA     EXEC
FR2     BSR.L   EXPR            found it, get the step value
        BRA     FR4
FR3     MOVEQ   #1,D0           not found, step defaults to 1
FR4     MOVE.L  D0,LOPINC       save that too
FR5     MOVE.L  CURRNT,LOPLN    save address of current line number
        MOVE.L  A0,LOPPT        and text pointer
        MOVE.L  SP,A6           dig into the stack to find 'LOPVAR'
        BRA     FR7
FR6     ADD.L   #20,A6          look at next stack frame
FR7     MOVE.L  (A6),D0         is it zero?
        BEQ     FR8             if so, we're done
        CMP.L   LOPVAR,D0       same as current LOPVAR?
        BNE     FR6             nope, look some more
        MOVE.L  SP,A2           Else remove 5 long words from...
        MOVE.L  A6,A1           inside the stack.
        LEA     20,A3
        ADD.L   A1,A3
        BSR.L   MVDOWN
        MOVE.L  A3,SP           set the SP 5 long words up
FR8     BRA     FINISH          and continue execution

NEXT    BSR.L   TSTV            get address of variable
        BCS.L   QWHAT           if no variable, say "What?"
        MOVE.L  D0,A1           save variable's address
NX0     MOVE.L  LOPVAR,D0       If 'LOPVAR' is zero, we never...
        BEQ.L   QWHAT           had a FOR loop, so say "What?"
        CMP.L   D0,A1           else we check them
        BEQ     NX3             OK, they agree
        BSR.L   POPA            nope, let's see the next frame
        BRA     NX0
NX3     MOVE.L  (A1),D0         get control variable's value
        ADD.L   LOPINC,D0       add in loop increment
        BVS.L   QHOW            say "How?" for 32-bit overflow
        MOVE.L  D0,(A1)         save control variable's new value
        MOVE.L  LOPLMT,D1       get loop's limit value
        TST.L   LOPINC
        BPL     NX1             branch if loop increment is positive
        EXG     D0,D1
NX1     CMP.L   D0,D1           test against limit
        BLT     NX2             branch if outside limit
        MOVE.L  LOPLN,CURRNT    Within limit, go back to the...
        MOVE.L  LOPPT,A0        saved 'CURRNT' and text pointer.
        BRA     FINISH
NX2     BSR.L   POPA            purge this loop
        BRA     FINISH
*
* *** REM *** IF *** INPUT *** LET (& DEFLT) ***
*
* 'REM' can be followed by anything and is ignored by the
* interpreter.
*
* 'IF' is followed by an expression, as a condition and one or
* more commands (including other 'IF's) separated by colons.
* Note that the word 'THEN' is not used. The interpreter evaluates
* the expression. If it is non-zero, execution continues. If it
* is zero, the commands that follow are ignored and execution
* continues on the next line.
*
* 'INPUT' is like the 'PRINT' command, and is followed by a list
* of items. If the item is a string in single or double quotes,
* or is an underline (back arrow), it has the same effect as in
* 'PRINT'. If an item is a variable, this variable name is
* printed out followed by a colon, then the interpreter waits for
* an expression to be typed in. The variable is then set to the
* value of this expression. If the variable is preceded by a
* string (again in single or double quotes), the string will be
* displayed followed by a colon. The interpreter then waits for an
* expression to be entered and sets the variable equal to the
* expression's value. If the input expression is invalid, the
* interpreter will print "What?", "How?", or "Sorry" and reprint
* the prompt and redo the input. The execution will not terminate
* unless you press control-C. This is handled in 'INPERR'.
*
* 'LET' is followed by a list of items separated by commas.
* Each item consists of a variable, an equals sign, and an
* expression. The interpreter evaluates the expression and sets
* the variable to that value. The interpreter will also handle
* 'LET' commands without the word 'LET'. This is done by 'DEFLT'.
*
REM     BRA     IF2             skip the rest of the line

IF      BSR.L   EXPR            evaluate the expression
IF1     TST.L   D0              is it zero?
        BNE     RUNSML          if not, continue
IF2     MOVE.L  A0,A1
        CLR.L   D1
        BSR.L   FNDSKP          if so, skip the rest of the line
        BCC     RUNTSL          and run the next line
        BRA.L   WSTART          if no next line, do a warm start

INPERR  MOVE.L  STKINP,SP       restore the old stack pointer
        MOVE.L  (SP)+,CURRNT    and old 'CURRNT'
        ADDQ.L  #4,SP
        MOVE.L  (SP)+,A0        and old text pointer

INPUT   MOVE.L  A0,-(SP)        save in case of error
        BSR.L   QTSTG           is next item a string?
        BRA.S   IP2             nope
        BSR.L   TSTV            yes, but is it followed by a variable?
        BCS     IP4             if not, branch
        MOVE.L  D0,A2           put away the variable's address
        BRA     IP3             if so, input to variable
IP2     MOVE.L  A0,-(SP)        save for 'PRTSTG'
        BSR.L   TSTV            must be a variable now
        BCS.L   QWHAT           "What?" it isn't?
        MOVE.L  D0,A2           put away the variable's address
        MOVE.B  (A0),D2         get ready for 'PRTSTG'
        CLR.B   D0
        MOVE.B  D0,(A0)
        MOVE.L  (SP)+,A1
        BSR.L   PRTSTG          print string as prompt
        MOVE.B  D2,(A0)         restore text
IP3     MOVE.L  A0,-(SP)        save in case of error
        MOVE.L  CURRNT,-(SP)    also save 'CURRNT'
        MOVE.L  #-1,CURRNT      flag that we are in INPUT
        MOVE.L  SP,STKINP       save the stack pointer too
        MOVE.L  A2,-(SP)        save the variable address
        MOVE.B  #':',D0         print a colon first
        BSR.L   GETLN           then get an input line
        LEA     BUFFER,A0       point to the buffer
        BSR.L   EXPR            evaluate the input
        MOVE.L  (SP)+,A2        restore the variable address
        MOVE.L  D0,(A2)         save value in variable
        MOVE.L  (SP)+,CURRNT    restore old 'CURRNT'
        MOVE.L  (SP)+,A0        and the old text pointer
IP4     ADDQ.L  #4,SP           clean up the stack
        BSR.L   TSTC            is the next thing a comma?
        DC.B    ',',IP5-*
        BRA     INPUT           yes, more items
IP5     BRA     FINISH

DEFLT   CMP.B   #CR,(A0)        empty line is OK
        BEQ     LT1             else it is 'LET'

LET     BSR.L   SETVAL          do the assignment
        BSR.L   TSTC            check for more 'LET' items
        DC.B    ',',LT1-*
        BRA     LET
LT1     BRA     FINISH          until we are finished.
*


*******************************************************************
*
* *** LOAD *** & SAVE ***
*
* These two commands transfer a program to/from an auxiliary
* device such as a cassette, another computer, etc.  The program
* is converted to an easily-stored format: each line starts with
* a colon, the line no. as 4 hex digits, and the rest of the line.
* At the end, a line starting with an '@' sign is sent.  This
* format can be read back with a minimum of processing time by
* the 68000.
*
LOAD    MOVE.L  TXTBGN,A0       set pointer to start of prog. area
        MOVE.B  #CR,D0          For a CP/M host, tell it we're ready...
        BSR     GOAUXO          by sending a CR to finish PIP command.
LOD1    BSR     GOAUXI          look for start of line
        BEQ     LOD1
        CMP.B   #'@',D0         end of program?
        BEQ     LODEND
        CMP.B   #':',D0         if not, is it start of line?
        BNE     LOD1            if not, wait for it
        BSR     GBYTE           get first byte of line no.
        MOVE.B  D1,(A0)+        store it
        BSR     GBYTE           get 2nd byte of line no.
        MOVE.B  D1,(A0)+        store that, too
LOD2    BSR     GOAUXI          get another text char.
        BEQ     LOD2
        MOVE.B  D0,(A0)+        store it
        CMP.B   #CR,D0          is it the end of the line?
        BNE     LOD2            if not, go back for more
        BRA     LOD1            if so, start a new line
LODEND  MOVE.L  A0,TXTUNF       set end-of-program pointer
        BRA     WSTART          back to direct mode

GBYTE   MOVEQ   #1,D2           get two hex characters from auxiliary
        CLR     D1              and store them as a byte in D1
GBYTE1  BSR     GOAUXI          get a char.
        BEQ     GBYTE1
        CMP.B   #'A',D0
        BCS     GBYTE2
        SUBQ.B  #7,D0           if greater than 9, adjust
GBYTE2  AND.B   #$F,D0          strip ASCII
        LSL.B   #4,D1           put nybble into the result
        OR.B    D0,D1
        DBRA    D2,GBYTE1       get another char.
        RTS

SAVE    MOVE.L  TXTBGN,A0       set pointer to start of prog. area
        MOVE.L  TXTUNF,A1       set pointer to end of prog. area
SAVE1   MOVE.B  #CR,D0          send out a CR & LF (CP/M likes this)
        BSR     GOAUXO
        MOVE.B  #LF,D0
        BSR     GOAUXO
        CMP.L   A0,A1           are we finished?
        BLS     SAVEND
        MOVE.B  #':',D0         if not, start a line
        BSR     GOAUXO
        MOVE.B  (A0)+,D1        send first half of line no.
        BSR     PBYTE
        MOVE.B  (A0)+,D1        and send 2nd half
        BSR     PBYTE
SAVE2   MOVE.B  (A0)+,D0        get a text char.
        CMP.B   #CR,D0          is it the end of the line?
        BEQ     SAVE1           if so, send CR & LF and start new line
        BSR     GOAUXO          send it out
        BRA     SAVE2           go back for more text
SAVEND  MOVE.B  #'@',D0         send end-of-program indicator
        BSR     GOAUXO
        MOVE.B  #CR,D0          followed by a CR & LF
        BSR     GOAUXO
        MOVE.B  #LF,D0
        BSR     GOAUXO
        MOVE.B  #$1A,D0         and a control-Z to end the CP/M file
        BSR     GOAUXO
        BRA     WSTART          then go do a warm start

PBYTE   MOVEQ   #1,D2           send two hex characters from D1's low byte
PBYTE1  ROL.B   #4,D1           get the next nybble
        MOVE.B  D1,D0
        AND.B   #$F,D0          strip off garbage
        ADD.B   #'0',D0         make it into ASCII
        CMP.B   #'9',D0
        BLS     PBYTE2
        ADDQ.B  #7,D0           adjust if greater than 9
PBYTE2  BSR     GOAUXO          send it out
        DBRA    D2,PBYTE1       then send the next nybble
        RTS
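
If you want to prepare or inspect saved programs on the host side, the stream format is simple enough to model in a few lines of Python. The sketch below follows the byte sequence that 'SAVE' sends and the scanning that 'LOAD' does, as read from the listing above. The function names and the list-of-pairs representation of a program are assumptions made for illustration only.

```python
CR, LF, CTRL_Z = "\r", "\n", "\x1a"


def save_program(lines):
    """lines: list of (line_number, text) pairs. Returns the saved stream."""
    out = [CR + LF]                      # SAVE1 sends CR & LF before anything else
    for number, text in lines:
        # A colon, the line number as 4 upper-case hex digits, then the text.
        out.append(":%04X%s" % (number, text))
        out.append(CR + LF)              # the stored CR is sent as CR & LF
    out.append("@" + CR + LF + CTRL_Z)   # end marker, then control-Z for CP/M
    return "".join(out)


def load_program(stream):
    """Inverse of save_program: recover the (line_number, text) pairs."""
    lines, pos = [], 0
    while pos < len(stream):
        ch = stream[pos]
        pos += 1
        if ch == "@":                    # end of program
            break
        if ch != ":":                    # anything else is skipped (LFs, noise)
            continue
        number = int(stream[pos:pos + 4], 16)
        pos += 4
        end = stream.index(CR, pos)      # text runs up to the CR
        lines.append((number, stream[pos:end]))
        pos = end + 1
    return lines
```

Because the loader skips everything until it sees a colon, the line feeds that 'SAVE' adds for CP/M's benefit cost nothing on the way back in.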




*
*******************************************************************
*
* *** POKE *** & CALL ***
*
* 'POKE expr1,expr2' stores the byte from 'expr2' into the memory
* address specified by 'expr1'.
*
* 'CALL expr' jumps to the machine language subroutine whose
* starting address is specified by 'expr'.  The subroutine can use
* all registers but must leave the stack the way it found it.
* The subroutine returns to the interpreter by executing an RTS.
*
POKE    BSR     EXPR            get the memory address
        BSR.L   TSTC            it must be followed by a comma
        DC.B    ',',PKER-*
        MOVE.L  D0,-(SP)        save the address
        BSR     EXPR            get the byte to be POKE'd
        MOVE.L  (SP)+,A1        get the address back
        MOVE.B  D0,(A1)         store the byte in memory
        BRA     FINISH
PKER    BRA.L   QWHAT           if no comma, say "What?"

CALL    BSR     EXPR            get the subroutine's address
        TST.L   D0              make sure we got a valid address
        BEQ.L   QHOW            if not, say "How?"
        MOVE.L  A0,-(SP)        save the text pointer
        MOVE.L  D0,A1
        JSR     (A1)            jump to the subroutine
        MOVE.L  (SP)+,A0        restore the text pointer
        BRA     FINISH


*
*******************************************************************
*
* *** EXPR ***
*
* 'EXPR' evaluates arithmetical or logical expressions.
* <EXPR>::=<EXPR2>
*          <EXPR2><rel.op.><EXPR2>
* where <rel.op.> is one of the operators in TAB8 and the result
* of these operations is 1 if true and 0 if false.
* <EXPR2>::=(+ or -)<EXPR3>(+ or -)<EXPR3>(...
* where () are optional and (... are optional repeats.
* <EXPR3>::=<EXPR4>( <* or /><EXPR4> )(...
* <EXPR4>::=<variable>
*           <function>
*           (<EXPR>)
* <EXPR> is recursive so that the variable '@' can have an <EXPR>
* as an index, functions can have an <EXPR> as arguments, and
* <EXPR4> can be an <EXPR> in parentheses.
*
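
The four-level grammar above is a classic recursive-descent design, and it may be easier to follow in a high-level language before reading the assembly. The Python sketch below mirrors the EXPR/EXPR2/EXPR3/EXPR4 levels one method each. It is an illustration under several assumptions: functions and the '@' array are left out, variables are single upper-case letters held in a dictionary, the six relational operators are the ones handled by the comparison routines that follow, and overflow checks are omitted. The class and function names are hypothetical.

```python
import re

# Two-character relational operators are tried before single characters.
TOKEN = re.compile(r"\s*(>=|<=|<>|[-+*/()<>=]|\d+|[A-Z])")

RELATIONS = {
    ">=": lambda a, b: a >= b,
    "<>": lambda a, b: a != b,
    ">": lambda a, b: a > b,
    "<=": lambda a, b: a <= b,
    "=": lambda a, b: a == b,
    "<": lambda a, b: a < b,
}


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("What?")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text, variables=None):
        self.tokens = tokenize(text)
        self.pos = 0
        self.variables = variables or {}

    def take(self, *choices):
        """Consume and return the next token if it is one of choices."""
        if self.pos < len(self.tokens) and self.tokens[self.pos] in choices:
            self.pos += 1
            return self.tokens[self.pos - 1]
        return None

    def expr(self):                      # <EXPR2> [ <rel.op.> <EXPR2> ]
        left = self.expr2()
        op = self.take(*RELATIONS)
        if op is None:
            return left
        return 1 if RELATIONS[op](left, self.expr2()) else 0

    def expr2(self):                     # optional sign, then sums
        if self.take("-"):
            value = -self.expr3()        # a leading minus is treated as 0 - term
        else:
            self.take("+")
            value = self.expr3()
        while True:
            op = self.take("+", "-")
            if op is None:
                return value
            right = self.expr3()
            value = value + right if op == "+" else value - right

    def expr3(self):                     # products and quotients
        value = self.expr4()
        while True:
            op = self.take("*", "/")
            if op is None:
                return value
            right = self.expr4()
            if op == "*":
                value *= right
            elif right == 0:
                raise ZeroDivisionError("How?")
            else:                        # quotient truncates toward zero
                quotient = abs(value) // abs(right)
                value = -quotient if (value < 0) != (right < 0) else quotient

    def expr4(self):                     # number, variable, or ( <EXPR> )
        if self.pos >= len(self.tokens):
            raise SyntaxError("What?")
        token = self.tokens[self.pos]
        self.pos += 1
        if token.isdigit():
            return int(token)
        if token.isalpha():
            return self.variables.get(token, 0)
        if token == "(":
            value = self.expr()
            if self.take(")") is None:
                raise SyntaxError("What?")
            return value
        raise SyntaxError("What?")


def evaluate(text, variables=None):
    return Parser(text, variables).expr()
```

For example, `evaluate("A*2>=10", {"A": 5})` yields 1, since a true relation is represented by 1 and a false one by 0.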


EXPR    BSR     EXPR2
        MOVE.L  D0,-(SP)        save <EXPR2> value
        LEA     TAB8,A1         look up a relational operator
        LEA     TAB8.1,A2
        BRA     EXEC            go do it

XP11    BSR     XP18            is it ">="?
        BLT     XPRT0           no, return D0=0
        BRA     XPRT1           else return D0=1

XP12    BSR     XP18            is it "<>"?
        BEQ     XPRT0           no, return D0=0
        BRA     XPRT1           else return D0=1

XP13    BSR     XP18            is it ">"?
        BLE     XPRT0           no, return D0=0
        BRA     XPRT1           else return D0=1

XP14    BSR     XP18            is it "<="?
        BGT     XPRT0           no, return D0=0
        BRA     XPRT1           else return D0=1

XP15    BSR     XP18            is it "="?
        BNE     XPRT0           if not, return D0=0
        BRA     XPRT1           else return D0=1
XP15RT  RTS

XP16    BSR     XP18            is it "<"?
        BGE     XPRT0           if not, return D0=0
        BRA     XPRT1           else return D0=1
XP16RT  RTS

XPRT0   CLR.L   D0              return D0=0 (false)
        RTS

XPRT1   MOVEQ   #1,D0           return D0=1 (true)
        RTS

XP17    MOVE.L  (SP)+,D0        it's not a rel. operator
        RTS                     return D0=<EXPR2>

XP18    MOVE.L  (SP)+,D0        reverse the top two stack items
        MOVE.L  (SP)+,D1
        MOVE.L  D0,-(SP)
        MOVE.L  D1,-(SP)
        BSR     EXPR2           do second <EXPR2>
        MOVE.L  (SP)+,D1
        CMP.L   D0,D1           compare with the first result
        RTS                     return the result

EXPR2   BSR.L   TSTC            negative sign?
        DC.B    '-',XP21-*
        CLR.L   D0              yes, fake '0-'
        BRA     XP26
XP21    BSR.L   TSTC            positive sign? ignore it
        DC.B    '+',XP22-*
XP22    BSR     EXPR3           first <EXPR3>
XP23    BSR.L   TSTC            add?
        DC.B    '+',XP25-*
        MOVE.L  D0,-(SP)        yes, save the value
        BSR     EXPR3           get the second <EXPR3>
XP24    MOVE.L  (SP)+,D1
        ADD.L   D1,D0           add it to the first <EXPR3>
        BVS.L   QHOW            branch if there's an overflow
        BRA     XP23            else go back for more operations
XP25    BSR.L   TSTC            subtract?
        DC.B    '-',XP42-*
XP26    MOVE.L  D0,-(SP)        yes, save the result of 1st <EXPR3>
        BSR     EXPR3           get second <EXPR3>
        NEG.L   D0              change its sign
        JMP     XP24            and do an addition

EXPR3   BSR     EXPR4           get first <EXPR4>
XP31    BSR.L   TSTC            multiply?
        DC.B    '*',XP34-*
        MOVE.L  D0,-(SP)        yes, save that first result
        BSR     EXPR4           get second <EXPR4>
        MOVE.L  (SP)+,D1
        BSR.L   MULT32          multiply the two
        BRA     XP31            then look for more terms
XP34    BSR.L   TSTC            divide?
        DC.B    '/',XP42-*
        MOVE.L  D0,-(SP)        save result of 1st <EXPR4>
        BSR     EXPR4           get second <EXPR4>
        MOVE.L  (SP)+,D1
        EXG     D0,D1
        BSR.L   DIV32           do the division
        BRA     XP31            go back for any more terms

EXPR4   LEA     TAB4,A1         find possible function
        LEA     TAB4.1,A2
        BRA     EXEC
XP40    BSR     TSTV            nope, not a function
        BCS     XP41            nor a variable
        MOVE.L  D0,A1
        CLR.L   D0
        MOVE.L  (A1),D0         if a variable, return its value in D0
EXP4RT  RTS
XP41    BSR.L   TSTNUM          or is it a number?
        MOVE.L  D1,D0
        TST     D2              (if not, # of digits will be zero)
        BNE     EXP4RT          if so, return it in D0
PARN    BSR.L   TSTC            else look for ( EXPR )
        DC.B    '(',XP43-*
        BSR     EXPR
        BSR.L   TSTC
        DC.B    ')',XP43-*
XP42    RTS
XP43    BRA.L   QWHAT           else say "What?"
*
* ===== Test for a valid variable name.  Returns Carry=1 if not
*       found, else returns Carry=0 and the address of the
*       variable in D0.
*
TSTV    BSR.L   IGNBLK
        CLR.L   D0
        MOVE.B  (A0),D0         look at the program text
        SUB.B   #'@',D0
        BCS     TSTVRT          C=1: not a variable
        BNE     TV1             branch if not "@" array
        ADDQ    #1,A0           If it is, it should be
        BSR     PARN            followed by (EXPR) as its index.
        ADD.L   D0,D0
        BCS.L   QHOW            say "How?" if index is too big
        ADD.L   D0,D0
        BCS.L   QHOW
        MOVE.L  D0,-(SP)        save the index
        BSR.L   SIZE            get amount of free memory
        MOVE.L  (SP)+,D1        get back the index
        CMP.L   D1,D0           see if there's enough memory
        BLS.L   QSORRY          if not, say "Sorry"
        MOVE.L  VARBGN,D0       put address of array element...
        SUB.L   D1,D0           into D0
        RTS
TV1     CMP.B   #27,D0          if not @, is it A through Z?
        EOR     #1,CCR
        BCS     TSTVRT          if not, set Carry and return
        ADDQ    #1,A0           else bump the text pointer
        ADD     D0,D0           compute the variable's address
        ADD     D0,D0
        MOVE.L  VARBGN,D1
        ADD     D1,D0           and return it in D0 with Carry=0
TSTVRT  RTS
*
* ===== Multiplies the 32 bit values in D0 and D1, returning
*       the 32 bit result in D0.
*
MULT32  MOVE.L  D1,D4
        EOR.L   D0,D4           see if the signs are the same
        TST.L   D0              take absolute value of D0
        BPL     MLT1
        NEG.L   D0
MLT1    TST.L   D1              take absolute value of D1
        BPL     MLT2
        NEG.L   D1
MLT2    CMP.L   #$FFFF,D1       is second argument <= 16 bits?
        BLS     MLT3            OK, let it through
        EXG     D0,D1           else swap the two arguments
        CMP.L   #$FFFF,D1       and check 2nd argument again
        BHI.L   QHOW            one of them MUST be 16 bits
MLT3    MOVE    D0,D2           prepare for 32 bit X 16 bit multiply
        MULU    D1,D2           multiply low word
        SWAP    D0
        MULU    D1,D0           multiply high word
        SWAP    D0
*** Rick Murray's bug correction follows:
        TST     D0              if lower word not 0, then overflow
        BNE.L   QHOW            if overflow, say "How?"
        ADD.L   D2,D0           D0 now holds the product
        BMI.L   QHOW            if sign bit set, it's an overflow
        TST.L   D4              were the signs the same?
        BPL     MLTRET
        NEG.L   D0              if not, make the result negative
MLTRET  RTS
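
The 68000's MULU only multiplies 16 bits by 16 bits, which is why 'MULT32' insists that one operand fit in a word and then combines two partial products. The following Python sketch shows that decomposition. It is a model of the technique rather than a bit-for-bit copy of the routine: Python integers do not wrap, so the final range check here is exact instead of a sign-bit test, and the function name is an assumption.

```python
def mult32(a, b):
    """Signed multiply built from two 16x16 partial products."""
    negative = (a < 0) != (b < 0)        # result sign, like the EOR.L into D4
    a, b = abs(a), abs(b)
    if b > 0xFFFF:
        a, b = b, a                      # try the other operand as the 16-bit one
        if b > 0xFFFF:
            raise OverflowError("How?")  # one of them must fit in 16 bits
    low = (a & 0xFFFF) * b               # low word of a times b
    high = (a >> 16) * b                 # high word of a times b
    if high > 0xFFFF:
        raise OverflowError("How?")      # high partial product spills past bit 31
    product = (high << 16) + low
    if product > 0x7FFFFFFF:
        raise OverflowError("How?")      # does not fit in a signed long word
    return -product if negative else product
```

The practical consequence for BASIC programs is that a product such as 70000*70000 is rejected with "How?" even before the size of the result is considered.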


*
* ===== Divide the 32 bit value in D0 by the 32 bit value in D1.
*       Returns the 32 bit quotient in D0, remainder in D1.
*
DIV32   TST.L   D1              check for divide-by-zero
        BEQ.L   QHOW            if so, say "How?"
        MOVE.L  D1,D2
        MOVE.L  D1,D4
        EOR.L   D0,D4           see if the signs are the same
        TST.L   D0              take absolute value of D0
        BPL     DIV1
        NEG.L   D0
DIV1    TST.L   D1              take absolute value of D1
        BPL     DIV2
        NEG.L   D1
DIV2    MOVEQ   #31,D3          iteration count for 32 bits
        MOVE.L  D0,D1
        CLR.L   D0
DIV3    ADD.L   D1,D1           (This algorithm was translated from
        ADDX.L  D0,D0           the divide routine in Ron Cain's
        BEQ     DIV4            Small-C run time library.)
        CMP.L   D2,D0
        BMI     DIV4
        ADDQ.L  #1,D1
        SUB.L   D2,D0
DIV4    DBRA    D3,DIV3
        EXG     D0,D1           put rem. & quot. in proper registers
        TST.L   D4              were the signs the same?
        BPL     DIVRT
        NEG.L   D0              if not, results are negative
        NEG.L   D1
DIVRT   RTS
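
The loop at DIV3 is the familiar shift-and-subtract division: the dividend is shifted left one bit at a time into a running remainder, and the divisor is subtracted whenever it fits. Here is a Python sketch of that technique. It assumes both operands are reduced to their magnitudes before the loop and applies the sign to both results afterwards, as the last lines of the routine do. The function name and the tuple return value are illustrative.

```python
def div32(dividend, divisor):
    """Shift-and-subtract division. Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("How?")
    negative = (dividend < 0) != (divisor < 0)
    quotient = abs(dividend)             # dividend bits leave, quotient bits enter
    remainder = 0
    magnitude = abs(divisor)
    for _ in range(32):
        # Shift the remainder:quotient pair left by one bit (ADD.L / ADDX.L).
        carry = (quotient >> 31) & 1
        quotient = (quotient << 1) & 0xFFFFFFFF
        remainder = (remainder << 1) | carry
        if remainder >= magnitude:       # divisor fits: subtract, set quotient bit
            remainder -= magnitude
            quotient |= 1
    if negative:                         # signs differed: negate both results
        quotient, remainder = -quotient, -remainder
    return quotient, remainder
```

Note the sign convention: when the operand signs differ, the remainder is negated along with the quotient, so 7 divided by -2 gives a quotient of -3 and a remainder of -1.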
*
* ===== The PEEK function returns the byte stored at the address
*       contained in the following expression.
*
PEEK    BSR     PARN            get the memory address
        MOVE.L  D0,A1
        CLR.L   D0              upper 3 bytes will be zero
        MOVE.B  (A1),D0         get the addressed byte
        RTS                     and return it
*
* ===== The RND function returns a random number from 1 to
*       the value of the following expression in D0.
*
RND     BSR     PARN            get the upper limit
        TST.L   D0              it must be positive and non-zero
        BEQ.L   QHOW
        BMI.L   QHOW
        MOVE.L  D0,D1
        MOVE.L  RANPNT,A1       get memory as a random number
        CMP.L   #LSTROM,A1
        BCS     RA1
        LEA     START,A1        wrap around if end of program
RA1     MOVE.L  (A1)+,D0        get the slightly random number
        BCLR    #31,D0          make sure it's positive
        MOVE.L  A1,RANPNT       (even I can do better than this!)
        BSR     DIV32           RND(n)=MOD(number,n)+1
        MOVE.L  D1,D0           MOD is the remainder of the div.
        ADDQ.L  #1,D0
        RTS
*
* ===== The ABS function returns an absolute value in D0.
*
ABS     BSR     PARN            get the following expr.'s value
        TST.L   D0
        BPL     ABSRT
        NEG.L   D0              if negative, complement it
        BMI.L   QHOW            if still negative, it was too big
ABSRT   RTS
*
* ===== The SIZE function returns the size of free memory in D0.
*
SIZE    MOVE.L  VARBGN,D0       get the number of free bytes...
        SUB.L   TXTUNF,D0       between 'TXTUNF' and 'VARBGN'
        RTS                     return the number in D0
*


*******************************************************************
*
* *** SETVAL *** FIN *** ENDCHK *** ERROR (& friends) ***
*
* 'SETVAL' expects a variable, followed by an equal sign and then
* an expression.  It evaluates the expression and sets the variable
* to that value.
*
* 'FIN' checks the end of a command.  If it ended with ":",
* execution continues.  If it ended with a CR, it finds the
* next line and continues from there.
*
* 'ENDCHK' checks if a command is ended with a CR.  This is
* required in certain commands, such as GOTO, RETURN, STOP, etc.
*
* 'ERROR' prints the string pointed to by A0.  It then prints the
* line pointed to by CURRNT with a "?" inserted at where the
* old text pointer (should be on top of the stack) points to.
* Execution of Tiny BASIC is stopped and a warm start is done.
* If CURRNT is zero (indicating a direct command), the direct
* command is not printed.  If CURRNT is -1 (indicating
* 'INPUT' command in progress), the input line is not printed
* and execution is not terminated but continues at 'INPERR'.
*
* Related to 'ERROR' are the following:
* 'QWHAT' saves text pointer on stack and gets "What?" message.
* 'AWHAT' just gets the "What?" message and jumps to 'ERROR'.
* 'QSORRY' and 'ASORRY' do the same kind of thing.
* 'QHOW' and 'AHOW' also do this for "How?".
*

SETVAL  BSR     TSTV            variable name?
        BCS     QWHAT           if not, say "What?"
        MOVE.L  D0,-(SP)        save the variable's address
        BSR.L   TSTC            get past the "=" sign
        DC.B    '=',SV1-*
        BSR     EXPR            evaluate the expression
        MOVE.L  (SP)+,A6
        MOVE.L  D0,(A6)         and save its value in the variable
        RTS
SV1     BRA     QWHAT           if no "=" sign

FIN     BSR.L   TSTC            *** FIN ***
        DC.B    ':',FI1-*
        ADDQ.L  #4,SP           if ":", discard return address
        BRA     RUNSML          continue on the same line
FI1     BSR.L   TSTC            not ":", is it a CR?
        DC.B    CR,FI2-*
        ADDQ.L  #4,SP           yes, purge return address
        BRA     RUNNXL          execute the next line
FI2     RTS                     else return to the caller

ENDCHK  BSR.L   IGNBLK
        CMP.B   #CR,(A0)        does it end with a CR?
        BNE     QWHAT           if not, say "WHAT?"
        RTS

QWHAT   MOVE.L  A0,-(SP)
AWHAT   LEA     WHTMSG,A6
ERROR   BSR.L   PRMESG          display the error message
        MOVE.L  (SP)+,A0        restore the text pointer
        MOVE.L  CURRNT,D0       get the current line number
        BEQ     WSTART          if zero, do a warm start
        CMP.L   #-1,D0          is the line no. pointer = -1?
        BEQ     INPERR          if so, redo input
        MOVE.B  (A0),-(SP)      save the char. pointed to
        CLR.B   (A0)            put a zero where the error is
        MOVE.L  CURRNT,A1       point to start of current line
        BSR.L   PRTLN           display the line in error up to the 0
        MOVE.B  (SP)+,(A0)      restore the character
        MOVE.B  #'?',D0         display a "?"
        BSR     GOOUT
        CLR     D0
        SUBQ.L  #1,A1           point back to the error char.
        BSR.L   PRTSTG          display the rest of the line
        BRA     WSTART          and do a warm start
QSORRY  MOVE.L  A0,-(SP)
ASORRY  LEA     SRYMSG,A6
        BRA     ERROR
QHOW    MOVE.L  A0,-(SP)        Error: "How?"
AHOW    LEA     HOWMSG,A6
        BRA     ERROR
*


*******************************************************************
*
* *** GETLN *** FNDLN (& friends) ***
*
* 'GETLN' reads an input line into 'BUFFER'.  It first prompts with
* the character in D0 (given by the caller), then it fills the
* buffer and echoes.  It ignores LF's but still echoes
* them back.  Control-H is used to delete the last character
* entered (if there is one), and control-X is used to delete the
* whole line and start over again.  CR signals the end of a line,
* and causes 'GETLN' to return.
*
* 'FNDLN' finds a line with a given line no. (in D1) in the
* text save area.  A1 is used as the text pointer.  If the line
* is found, A1 will point to the beginning of that line
* (i.e. the high byte of the line no.), and flags are NC & Z.
* If that line is not there and a line with a higher line no.
* is found, A1 points there and flags are NC & NZ.  If we reached
* the end of the text save area and cannot find the line, flags
* are C & NZ.
* 'FNDLN' will initialize A1 to the beginning of the text save
* area to start the search.  Some other entries of this routine
* will not initialize A1 and do the search.
* 'FNDLNP' will start with A1 and search for the line no.
* 'FNDNXT' will bump A1 by 2, find a CR and then start search.
* 'FNDSKP' uses A1 to find a CR, and then starts the search.
*
GETLN   BSR     GOOUT           display the prompt
        MOVE.B  #' ',D0         and a space
        BSR     GOOUT
        LEA     BUFFER,A0       A0 is the buffer pointer
GL1     BSR.L   CHKIO           check keyboard
        BEQ     GL1             wait for a char. to come in
        CMP.B   #CTRLH,D0       delete last character?
        BEQ     GL3             if so
        CMP.B   #CTRLX,D0       delete the whole line?
        BEQ     GL4             if so
        CMP.B   #CR,D0          accept a CR
        BEQ     GL2
        CMP.B   #' ',D0         if other control char., discard it
        BCS     GL1
GL2     MOVE.B  D0,(A0)+        save the char.
        BSR     GOOUT           echo the char back out
        CMP.B   #CR,D0          if it's a CR, end the line
        BEQ     GL7
        CMP.L   #(BUFFER+BUFLEN-1),A0  any more room?
        BCS     GL1             yes: get some more, else delete last char.
GL3     MOVE.B  #CTRLH,D0       delete a char. if possible
        BSR     GOOUT
        MOVE.B  #' ',D0
        BSR     GOOUT
        CMP.L   #BUFFER,A0      any char.'s left?
        BLS     GL1             if not
        MOVE.B  #CTRLH,D0       if so, finish the BS-space-BS sequence
        BSR     GOOUT
        SUBQ.L  #1,A0           decrement the text pointer
        BRA     GL1             back for more
GL4     MOVE.L  A0,D1           delete the whole line
        SUB.L   #BUFFER,D1      figure out how many backspaces we need
        BEQ     GL6             if none needed, branch
        SUBQ    #1,D1           adjust for DBRA
GL5     MOVE.B  #CTRLH,D0       and display BS-space-BS sequences
        BSR     GOOUT
        MOVE.B  #' ',D0
        BSR     GOOUT
        MOVE.B  #CTRLH,D0
        BSR     GOOUT
        DBRA    D1,GL5
GL6     LEA     BUFFER,A0       reinitialize the text pointer
        BRA     GL1             and go back for more
GL7     MOVE.B  #LF,D0          echo a LF for the CR
        BSR     GOOUT
        RTS

FNDLN   CMP.L   #$FFFF,D1       line no. must be < 65535
        BCC     QHOW
        MOVE.L  TXTBGN,A1       init. the text save pointer

FNDLNP  MOVE.L  TXTUNF,A2       check if we passed the end
        SUBQ.L  #1,A2
        CMP     A1,A2
        BCS     FNDRET          if so, return with Z=0 & C=1
        MOVE.B  (A1)+,D2        if not, get a line no.
        LSL     #8,D2
        MOVE.B  (A1),D2
        SUBQ.L  #1,A1
        CMP     D1,D2           is this the line we want?
        BCS     FNDNXT          no, not there yet
FNDRET  RTS                     return the cond. codes

FNDNXT  ADDQ.L  #2,A1           find the next line

FNDSKP  CMP.B   #CR,(A1)+       try to find a CR
        BNE     FNDSKP          keep looking
        BRA     FNDLNP          check if end of text
*
*******************************************************************
*
* *** MVUP *** MVDOWN *** POPA *** PUSHA ***
*
* 'MVUP' moves a block up from where A1 points to where A2 points
* until A1=A3
*
* 'MVDOWN' moves a block down from where A1 points to where A3
* points until A1=A2
*
* 'POPA' restores the 'FOR' loop variable save area from the stack
*
* 'PUSHA' stacks the 'FOR' loop variable save area onto the stack
*
MVUP    CMP.L   A1,A3           see the above description
        BEQ     MVRET
        MOVE.B  (A1)+,(A2)+
        BRA     MVUP
MVRET   RTS

MVDOWN  CMP.L   A1,A2           see the above description
        BEQ     MVRET
        MOVE.B  -(A1),-(A3)
        BRA     MVDOWN

POPA    MOVE.L  (SP)+,A6        A6 = return address
        MOVE.L  (SP)+,LOPVAR    restore LOPVAR, but zero means no more
        BEQ     PP1
        MOVE.L  (SP)+,LOPINC    if not zero, restore the rest
        MOVE.L  (SP)+,LOPLMT
        MOVE.L  (SP)+,LOPLN
        MOVE.L  (SP)+,LOPPT
PP1     JMP     (A6)            return

PUSHA   MOVE.L  STKLMT,D1       Are we running out of stack room?
        SUB.L   SP,D1
        BCC     QSORRY          if so, say we're sorry
        MOVE.L  (SP)+,A6        else get the return address
        MOVE.L  LOPVAR,D1       save loop variables
        BEQ     PU1             if LOPVAR is zero, that's all
        MOVE.L  LOPPT,-(SP)     else save all the others
        MOVE.L  LOPLN,-(SP)
        MOVE.L  LOPLMT,-(SP)
        MOVE.L  LOPINC,-(SP)
PU1     MOVE.L  D1,-(SP)
        JMP     (A6)            return


*
*******************************************************************
*
* *** PRTSTG *** QTSTG *** PRTNUM *** PRTLN ***
*
* 'PRTSTG' prints a string pointed to by A1.  It stops printing
* and returns to the caller when either a CR is printed or when
* the next byte is the same as what was passed in D0 by the
* caller.
*
* 'QTSTG' looks for an underline (back-arrow on some systems),
* single-quote, or double-quote.  If none of these are found, returns
* to the caller.  If underline, outputs a CR without a LF.  If single
* or double quote, prints the quoted string and demands a matching
* end quote.  After the printing, the next 2 bytes of the caller are
* skipped over (usually a short branch instruction).
*
* 'PRTNUM' prints the 32 bit number in D1, leading blanks are added if
* needed to pad the number of spaces to the number in D4.
* However, if the number of digits is larger than the no. in
* D4, all digits are printed anyway.  Negative sign is also
* printed and counted in, positive sign is not.
*
* 'PRTLN' prints the saved text line pointed to by A1
* with line no. and all.
*
PRTSTG  MOVE.B  D0,D1           save the stop character
PS1     MOVE.B  (A1)+,D0        get a text character
        CMP.B   D0,D1           same as stop character?
        BEQ     PRTRET          if so, return
        BSR     GOOUT           display the char.
        CMP.B   #CR,D0          is it a C.R.?
        BNE     PS1             no, go back for more
        MOVE.B  #LF,D0          yes, add a L.F.
        BSR     GOOUT
PRTRET  RTS                     then return

QTSTG   BSR.L   TSTC            *** QTSTG ***
        DC.B    '"',QT3-*
        MOVE.B  #'"',D0         it is a "
QT1     MOVE.L  A0,A1
        BSR     PRTSTG          print until another
        MOVE.L  A1,A0
        MOVE.L  (SP)+,A1        pop return address
        CMP.B   #LF,D0          was last one a CR?
        BEQ     RUNNXL          if so, run next line
QT2     ADDQ.L  #2,A1           skip 2 bytes on return
        JMP     (A1)            return
QT3     BSR.L   TSTC            is it a single quote?
        DC.B    '''',QT4-*
        MOVE.B  #'''',D0        if so, do same as above
        BRA     QT1
QT4     BSR.L   TSTC            is it an underline?
        DC.B    '_',QT5-*
        MOVE.B  #CR,D0          if so, output a CR without LF
        BSR.L   GOOUT
        MOVE.L  (SP)+,A1        pop return address
        BRA     QT2
QT5     RTS                     none of the above

PRTNUM  MOVE.L  D1,D3           save the number for later
        MOVE    D4,-(SP)        save the width value
        MOVE.B  #$FF,-(SP)      flag for end of digit string
        TST.L   D1              is it negative?
        BPL     PN1             if not
        NEG.L   D1              else make it positive
        SUBQ    #1,D4           one less for width count
PN1     DIVU    #10,D1          get the next digit
        BVS     PNOV            overflow flag set?
        MOVE.L  D1,D0           if not, save remainder
        AND.L   #$FFFF,D1       strip the remainder
        BRA     TOASCII         skip the overflow stuff
PNOV    MOVE    D1,D0           prepare for long word division
        CLR.W   D1              zero out low word
        SWAP    D1              high word into low
        DIVU    #10,D1          divide high word
        MOVE    D1,D2           save quotient
        MOVE    D0,D1           low word into low
        DIVU    #10,D1          divide low word
        MOVE.L  D1,D0           D0 = remainder
        SWAP    D1              R/Q becomes Q/R
        MOVE    D2,D1           D1 is low/high
        SWAP    D1              D1 is finally high/low
TOASCII SWAP    D0              get remainder
        MOVE.B  D0,-(SP)        stack it as a digit
        SWAP    D0
        SUBQ    #1,D4           decrement width count
        TST.L   D1              if quotient is zero, we're done
        BNE     PN1
        SUBQ    #1,D4           adjust padding count for DBRA
        BMI     PN4             skip padding if not needed
PN3     MOVE.B  #' ',D0         display the required leading spaces
        BSR     GOOUT
        DBRA    D4,PN3
PN4     TST.L   D3              is number negative?
        BPL     PN5
        MOVE.B  #'-',D0         if so, display the sign
        BSR     GOOUT
PN5     MOVE.B  (SP)+,D0        now unstack the digits and display
        BMI     PNRET           until the flag code is reached
        ADD.B   #'0',D0         make into ASCII
        BSR     GOOUT
        BRA     PN5
PNRET   MOVE    (SP)+,D4        restore width value
        RTS

PRTLN   CLR.L   D1
        MOVE.B  (A1)+,D1        get the binary line number
        LSL     #8,D1
        MOVE.B  (A1)+,D1
        MOVEQ   #5,D4           display a 5 digit line no.
        BSR     PRTNUM
        MOVE.B  #' ',D0         followed by a blank
        BSR     GOOUT
        CLR     D0              stop char. is a zero
        BRA     PRTSTG          display the rest of the line
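
'PRTNUM' produces its digits least-significant first, stacks them, and then unstacks them for display after any padding and sign. The Python sketch below models only that formatting behaviour, not the 32-by-16 division trick used to get each digit. The function name is hypothetical, and the digit stack is an ordinary list.

```python
def format_number(value, width):
    """Right-justify value in width columns, as 'PRTNUM' is described to do."""
    digits = []                          # plays the role of the digit stack
    magnitude = abs(value)
    while True:
        magnitude, digit = divmod(magnitude, 10)
        digits.append(chr(ord("0") + digit))
        if magnitude == 0:
            break
    sign = "-" if value < 0 else ""      # the sign counts toward the width
    padding = width - len(digits) - len(sign)
    # Leading blanks come first, then the sign, then the digits unstacked.
    return " " * max(padding, 0) + sign + "".join(reversed(digits))
```

'PRTLN' corresponds to calling this with a width of 5 for the line number and then printing a blank and the rest of the line.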


*
* ===== Test text byte following the call to this subroutine.  If it
*       equals the byte pointed to by A0, return to the code following
*       the call.  If they are not equal, branch to the point
*       indicated by the offset byte following the text byte.
*
TSTC    BSR     IGNBLK          ignore leading blanks
        MOVE.L  (SP)+,A1        get the return address
        MOVE.B  (A1)+,D1        get the byte to compare
        CMP.B   (A0),D1         is it = to what A0 points to?
        BEQ     TC1             if so
        CLR.L   D1              If not, add the second
        MOVE.B  (A1),D1         byte following the call to
        ADD.L   D1,A1           the return address.
        JMP     (A1)            jump to the routine
TC1     ADDQ.L  #1,A0           if equal, bump text pointer
        ADDQ.L  #1,A1           Skip the 2 bytes following
        JMP     (A1)            the call and continue.
*
* ===== See if the text pointed to by A0 is a number.  If so,
*       return the number in D1 and the number of digits in D2,
*       else return zero in D1 and D2.
*
TSTNUM  CLR.L   D1              initialize return parameters
        CLR     D2
        BSR     IGNBLK          skip over blanks
TN1     CMP.B   #'0',(A0)       is it less than zero?
        BCS     TSNMRET         if so, that's all
        CMP.B   #'9',(A0)       is it greater than nine?
        BHI     TSNMRET         if so, return
        CMP.L   #214748364,D1   see if there's room for new digit
        BCC     QHOW            if not, we've overflowed
        MOVE.L  D1,D0           quickly multiply result by 10
        ADD.L   D1,D1
        ADD.L   D1,D1
        ADD.L   D0,D1
        ADD.L   D1,D1
        MOVE.B  (A0)+,D0        add in the new digit
        AND.L   #$F,D0
        ADD.L   D0,D1
        ADDQ    #1,D2           increment the no. of digits
        BRA     TN1
TSNMRET RTS
*
* ===== Skip over blanks in the text pointed to by A0.
*
IGNBLK  CMP.B   #' ',(A0)       see if it's a space
        BNE     IGBRET          if so, swallow it
IGB1    ADDQ.L  #1,A0           increment the text pointer
        BRA     IGNBLK
IGBRET  RTS
*
* ===== Convert the line of text in the input buffer to upper
*       case (except for stuff between quotes).
*
TOUPBUF LEA     BUFFER,A0       set up text pointer
        CLR.B   D1              clear quote flag
TOUPB1  MOVE.B  (A0)+,D0        get the next text char.
        CMP.B   #CR,D0          is it end of line?
        BEQ     TOUPBRT         if so, return
        CMP.B   #'"',D0         a double quote?
        BEQ     DOQUO
        CMP.B   #'''',D0        or a single quote?
        BEQ     DOQUO
        TST.B   D1              inside quotes?
        BNE     TOUPB1          if so, do the next one
        BSR     TOUPPER         convert to upper case
        MOVE.B  D0,-(A0)        store it
        ADDQ.L  #1,A0
        BRA     TOUPB1          and go back for more
TOUPBRT RTS

DOQUO   TST.B   D1              are we inside quotes?
        BNE     DOQUO1
        MOVE.B  D0,D1           if not, toggle inside-quotes flag
        BRA     TOUPB1
DOQUO1  CMP.B   D0,D1           make sure we're ending proper quote
        BNE     TOUPB1          if not, ignore it
        CLR.B   D1              else clear quote flag
        BRA     TOUPB1
*
* ===== Convert the character in D0 to upper case
*
TOUPPER CMP.B   #'a',D0         is it < 'a'?
        BCS     TOUPRET
        CMP.B   #'z',D0         or > 'z'?
        BHI     TOUPRET
        SUB.B   #32,D0          if not, make it upper case
TOUPRET RTS
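
The quote handling in 'TOUPBUF' is a small state machine: the quote flag remembers which kind of quote opened the string, and only the same kind closes it. A Python sketch of that behaviour follows, assuming plain ASCII text; the function name is an illustrative choice.

```python
def to_upper_outside_quotes(line):
    """Upper-case a line of text except for anything between quotes."""
    out = []
    quote = None                         # None, or the quote char that is open
    for ch in line:
        if ch in ('"', "'"):
            if quote is None:
                quote = ch               # entering a quoted string
            elif quote == ch:
                quote = None             # matching quote ends the string
            # a different quote char inside a string is just text
            out.append(ch)
        elif quote is None and "a" <= ch <= "z":
            out.append(chr(ord(ch) - 32))  # same adjustment as SUB.B #32
        else:
            out.append(ch)
    return "".join(out)
```

This is why a program line such as `print "it's ok", a` keeps its string intact: the apostrophe inside the double-quoted string does not toggle the flag.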


*
* 'CHKIO' checks the input.  If there's no input, it will return
* to the caller with the Z flag set.  If there is input, the Z
* flag is cleared and the input byte is in D0.  However, if a
* control-C is read, 'CHKIO' will warm-start BASIC and will not
* return to the caller.
*
CHKIO   BSR.L   GOIN            get input if possible
        BEQ     CHKRET          if Zero, no input
        CMP.B   #CTRLC,D0       is it control-C?
        BNE     CHKRET          if not
        BRA.L   WSTART          if so, do a warm start
CHKRET  RTS
*
* ===== Display a CR-LF sequence
*
CRLF    LEA     CLMSG,A6
*
* ===== Display a zero-ended string pointed to by register A6
*
PRMESG  MOVE.B  (A6)+,D0        get the char.
        BEQ     PRMRET          if it's zero, we're done
        BSR     GOOUT           else display it
        BRA     PRMESG
PRMRET  RTS

******************************************************
* The following routines are the only ones that need *
* to be changed for a different I/O environment.     *
******************************************************

*
* ===== Output character to the console (Port 1) from register D0
*       (Preserves all registers.)
*
OUTC    BTST    #1,$10040       is port 1 ready for a character?
        BEQ     OUTC            if not, wait for it
        MOVE.B  D0,$10042       out it goes.
        RTS

*
* ===== Input a character from the console into register D0 (or
*       return Zero status if there's no character available).
*
INC     BTST    #0,$10040       is character ready?
        BEQ     INCRET          if not, return Zero status
        MOVE.B  $10042,D0       else get the character
        AND.B   #$7F,D0         zero out the high bit
INCRET  RTS
*
* ===== Output character to the host (Port 2) from register D0
*       (Preserves all registers.)
*
AUXOUT  BTST    #1,$10041       is port 2 ready for a character?
        BEQ     AUXOUT          if not, wait for it
        MOVE.B  D0,$10043       out it goes.
        RTS
*
AUXIN   BTST    #0,$10041       is character ready?
        BEQ     AXIRET          if not, return Zero status
        MOVE.B  $10043,D0       else get the character
        AND.B   #$7F,D0         zero out the high bit
AXIRET  RTS
*
* ===== Return to the resident monitor, operating system, etc.
*
BYEBYE  MOVE.B  #228,D7         return to Tutor
        TRAP    #14

INITMSG DC.B    CR,LF,'Gordo''s MC68000 Tiny BASIC, v1.2',CR,LF,LF,0
OKMSG   DC.B    CR,LF,'OK',CR,LF,0
HOWMSG  DC.B    'How?',CR,LF,0
WHTMSG  DC.B    'What?',CR,LF,0
SRYMSG  DC.B    'Sorry.'
CLMSG   DC.B    CR,LF,0
        DC.B    0               <- for aligning on a word boundary
LSTROM  EQU     *               end of possible ROM area
*
* Internal variables follow:
*
RANPNT  DC.L    START           random number pointer
CURRNT  DS.L    1               Current line pointer
STKGOS  DS.L    1               Saves stack pointer in 'GOSUB'
STKINP  DS.L    1               Saves stack pointer during 'INPUT'
LOPVAR  DS.L    1               'FOR' loop save area
LOPINC  DS.L    1               increment
LOPLMT  DS.L    1               limit
LOPLN   DS.L    1               line number
LOPPT   DS.L    1               text pointer
TXTUNF  DS.L    1               points to unfilled text area
VARBGN  DS.L    1               points to variable area
STKLMT  DS.L    1               holds lower limit for stack growth
BUFFER  DS.B    BUFLEN          Keyboard input buffer
TXT     EQU     *               Beginning of program area
Comfort:
A Faster Forth 


Alexander Burger and Ronald Greene 


One of the most effective ways to use a 32-bit processor is 
to run a threaded language, such as Forth. Forth lends itself 
well to the 68000 because of its highly stack-oriented, 
immediate nature. This chapter describes one such Forth 
system, and demonstrates techniques that will help you 
build a threaded Forth system on your own 68000 machine. 


A threaded interpretive language (TIL) like Forth is usually implemented as 
a virtual machine consisting of: 


* two stacks (return and data); 
* an inner and outer interpreter; 
* a set of primitive routines. 


The return stack is used mainly to hold return addresses of deferred 
execution streams; the data stack is the primary device for passing data to and 
from routines. The primitives, which constitute the "instruction set" of the 
virtual TIL machine, usually consist of short routines that perform such tasks 
as moving data to and from the stack and performing arithmetic operations on 
the top stack elements. By threading these primitives together, a programmer 
can create higher-level words (called "secondary words" or "secondaries") in a 
treelike fashion, until the final application program is represented by a single 
top-level word. 

The conventional way to generate a secondary is to write a list of ref- 
erences (e.g., addresses) to primitives or other secondaries that reside at known 
locations in memory. The function of the inner interpreter is to ensure that 
the microprocessor executes the underlying primitives in the proper order. The 
flow from one primitive to another is analogous to a thread connecting a 
string of beads, hence the name "threaded code." 

In the age of 8-bit microprocessors, the primitive operations performed by 
the virtual TIL machine were typically several machine instructions long. The 
limited memory accessible to these processors mandated that the primitives be 
kept in fixed locations and all references to them be "compiled" as pointers to
their executable code or as offsets into a jump table. The result was impres- 
sively compact Forth compilers/interpreters and other programs. 


Threading Schemes 

There are four threading methods. Three of these—direct threading, indirect 
threading and token threading—have commonly been used to implement 
Forth and other TILs on various microprocessors. (These three methods are 
referred to collectively as "pointer threading.") 

Direct and indirect threading are similar in that both methods use pointers 
to routines. Direct threading executes somewhat faster than indirect because 
the location pointed to contains executable machine code; in indirect thread- 
ing, this location contains a pointer to the executable code. Indirect threading, 
however, provides better machine independence. Token threading is the slow- 
est method, but it has the advantage of using an address space greater than the 
pointer size. All three methods use special inner interpreter code to extract 
information from the pointer and jump to the executable code of the 
primitive. The execution speed of pointer-threaded code is determined pri- 
marily by the efficiency of this inner interpreter. 


Direct Threading 


Direct threading executes the fastest. In a typical implementation of direct- 
threaded code, a CPU register is reserved for use as the interpreter pointer (IP), 
which acts as the program counter of the virtual machine. The short inner 
interpreter code, which is inserted at the end of each primitive, must jump to 
the routine whose address is contained in the memory location pointed to by 
IP, and increment IP (usually in the reverse order) to point to the next address 
in the list. This inner interpreter code generally requires several machine 
instructions, one of which is a time-consuming jump. For short primitives 
the execution time of the inner interpreter can be significantly longer than the 
rest of the primitive. 
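
The role of the interpreter pointer and the inner interpreter can be modeled in a few lines. The following Python sketch is only an illustration and not the code of any real Forth: the names run, SQUARE and SUM_SQ are invented here, primitives are plain Python functions, and a secondary is simply a list of references to other words.

```python
# Toy model of pointer threading (all names are hypothetical).
# A primitive is a Python function; a secondary is a list of references.

def dup(s):  s.append(s[-1])
def swap(s): s[-1], s[-2] = s[-2], s[-1]
def add(s):  b = s.pop(); s[-1] += b
def mul(s):  b = s.pop(); s[-1] *= b

SQUARE = [dup, mul]                     # : square  dup * ;
SUM_SQ = [SQUARE, swap, SQUARE, add]    # : sum-sq  square swap square + ;

def run(word, data):
    """Inner interpreter: walk the thread, keeping saved IPs on a return stack."""
    rstack = []                  # return stack of (thread, ip) pairs
    thread, ip = [word], 0       # start with a one-entry thread holding `word`
    while True:
        if ip == len(thread):    # the end of a thread plays the role of "semi"
            if not rstack:
                return data
            thread, ip = rstack.pop()
            continue
        entry = thread[ip]       # fetch the reference the IP points to
        ip += 1                  # ... and advance the IP
        if callable(entry):
            entry(data)          # primitive: execute it
        else:
            rstack.append((thread, ip))   # secondary: save IP and descend
            thread, ip = entry, 0

print(run(SUM_SQ, [3, 4]))       # 3*3 + 4*4 -> [25]
```

Every trip around the while loop is overhead that the primitives themselves do not need, which is the cost the rest of this chapter sets out to remove.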

The good news is that the operation of the inner interpreter can be 
performed by a single instruction implemented in almost all microprocessors, 
a feature that speeds execution and saves memory. If the processor hardware 
stack is used as the interpreter pointer, the return-from-subroutine instruction 
will do exactly what we want: pop the next address into the program counter 
and increment the interpreter pointer (the hardware stack pointer) to point to 
the next word in the list. Note that this "stack threading" is not really a new 
threading method; rather, it is an especially elegant and efficient means of 
direct threading. 

The bad news has two parts. First, if the hardware stack is used for the 
interpreter pointer, it is not available for use as the data stack of the virtual 
machine. This unavailability could be a serious problem for most 8-bit ma- 
chines, which have too few registers to effectively implement the data stack, 
the return stack and the code of the primitive. 
The second piece of bad news is that unless interrupts are disabled, the
processor may automatically save information on the hardware stack at the 
beginning of an interrupt, a procedure that has the effect of overwriting the 
TIL code. This problem is particularly serious for a system used in control 
applications, since it would generally not be feasible to disable interrupts. 
Thus, stack threading is probably not a viable option for 8-bit micro- 
processors. (It may be worthwhile to consider subroutine threading for such 
processors. See comments below.) 

The situation is quite different for the newer, more powerful 16- and 32-bit 
microprocessors, such as the Motorola 68000, Zilog Z8000 and National 
Semiconductor 32016. Each of these processors uses a "supervisor" stack for 
interrupts, so the second problem described above is of no concern. They also 
have a sufficiently powerful instruction set and enough registers to use stack 
threading efficiently. In the 68000, for example, any address register can be 
used as a stack pointer, so that a7 can be freed from its usual task as the 
return stack pointer and used instead as the interpreter pointer. Listing 7.1 
shows an example of 68000 code for a secondary definition using this 
method. All primitives end with the "rts" instruction, which performs the 
task of the inner interpreter. 

When a secondary is entered, the current interpreter pointer (a7) is saved on 
the return stack. (The example uses a6 as the return stack pointer, though any 
other address register would do as well.) The IP is then loaded with the address 
of the second entry in the pointer list (the location containing ADR_2 in the 
listing) using program counter relative addressing, and a jump is made to 
ADR_1. Regardless of whether the routine at ADR_1 is a primitive or a 
secondary, an rts instruction that will pop ADR_2 into the program counter 
will eventually be reached, thus transferring control to the second word in the 
list. 

In this way, the words at all addresses in the list are executed until the 
"semi" routine is reached. This word performs a return from secondary by 
popping the previously saved interpreter pointer from the return stack and 
executing an rts. (The name "semi" refers to the semicolon used in Forth to 
terminate secondary definitions.) 
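
The control flow just described can be traced with a small simulation. This is a sketch under stated assumptions rather than a rendering of Listing 7.1: the addresses, the dictionary-based memory, and the HALT marker used to stop the top-level word are all invented for the example. The variable a7 is the interpreter pointer, rstack stands in for the return stack addressed by a6, and the two lines marked "rts" do the whole job of the inner interpreter.

```python
# Toy simulation of stack threading (hypothetical addresses and layout).
HALT, SEMI, DUP, MUL, SWAP, ADD = -1, 0, 1, 2, 3, 4
SQUARE, SUM_SQ = 100, 110            # entry addresses of two secondaries

def _dup(s):  s.append(s[-1])
def _mul(s):  b = s.pop(); s[-1] *= b
def _swap(s): s[-1], s[-2] = s[-2], s[-1]
def _add(s):  b = s.pop(); s[-1] += b
PRIMS = {DUP: _dup, MUL: _mul, SWAP: _swap, ADD: _add}

LISTS = {SQUARE: 200, SUM_SQ: 210}   # where each secondary's pointer list starts
mem = {200: DUP, 201: MUL, 202: SEMI,
       210: SQUARE, 211: SWAP, 212: SQUARE, 213: ADD, 214: SEMI,
       300: HALT}                    # top-level list: stop after one word

def execute(entry, data):
    rstack, a7, pc = [], 300, entry
    while pc != HALT:
        if pc in LISTS:              # prologue of a secondary
            rstack.append(a7)        #   move.l a7,-(a6)
            a7 = LISTS[pc] + 1       #   lea    8(pc),a7
            pc = mem[LISTS[pc]]      #   jmp    ADR_1
            continue
        if pc == SEMI:
            a7 = rstack.pop()        #   movea.l (a6)+,a7
        else:
            PRIMS[pc](data)          # body of a primitive
        pc = mem[a7]                 # rts: pop the next address ...
        a7 += 1                      # ... and advance the interpreter pointer
    return data

print(execute(SUM_SQ, [3, 4]))       # [25]
```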


Macro/Subroutine Threading 

The fourth threading method, subroutine threading, links together pre- 
viously defined words by means of subroutine calls. Like stack threading, this 
method uses the processor's stack to perform the housekeeping of the inner 
interpreter in hardware, and thus executes efficiently. It also requires fewer 
registers than pointer threading, which might increase speed on a processor 
with few registers, since fewer memory references may be required within a 
given primitive. Subroutine threading is rarely used, however, primarily 
because it is less memory efficient than a pointer-threaded technique. In the 
older 8-bit machines, where address space is limited to 64K, subroutine 
threading uses about 50 percent more memory in a secondary than typical 
pointer-threaded code because each 2-byte address is accompanied by a 1-byte
machine code for a subroutine call. 

Perhaps the major advantage of subroutine threading over pointer threading 
is its ability to use in-line code. For example, it is possible to indicate to the 
Forth compiler that a particular word is to be used as a macro (that is, 
compiled as in-line code rather than as a subroutine call). This ability not 
only eliminates the threading overhead for such words, but also allows simple 
optimization of code, which we will discuss later. 

Figure 7.1 shows that in pointer threading, references to previously defined 
words are compiled as a list of addresses (or some other token). M/S thread- 
ing, on the other hand, compiles executable code—either subroutine calls to 
previously defined words or the actual instructions that make up those words. 
In practice, memory constraints require that only relatively short primitives 
be treated as macros; however, these are precisely the words that benefit most 
from the elimination of threading overhead since their execution times are 
typically much shorter than those of the inner interpreter. 

When many processor instructions are required to implement a single TIL 
instruction, as is the case with most 8-bit machines, it wastes memory to 
write the instructions in-line; a subroutine call uses considerably less memory 
while adding relatively little overhead. However, if only one or two processor 
instructions are required for the TIL instruction (as with the 68000 MPU), it 
is faster and more memory efficient to write the code in-line. (We'll talk about 
this later in more detail.) 


Comparison of M/S and Stack Threading 

The two threading schemes discussed above are complementary in the 
sense that they are most efficient at opposite programming level extremes. 
M/S threading is best for low-level Forth programming in which the ratio of 
primitives to secondaries is one to one or greater. It requires no overhead for 
short primitives, since they are written in-line, and a relatively small amount 
(subroutine call and return) for other words. Stack threading is not efficient for 
low-level code because of the overhead in jumping from one primitive to 
another. For high-level Forth programming, in which new words are defined 
entirely or almost entirely in terms of previously defined secondaries, 
subroutine calls/returns (two jumps) are not as efficient as stack threading, 
which jumps directly from the end of one word to the beginning of the next. 


        |                    |        |                    |
        |         :          |        |         :          |
        | pointer to word1   |        | call word1         |
        | pointer to word2   |        | code for word2     |
        | pointer to word3   |        | call word3         |
        |         :          |        |         :          |
        |                    |        |                    |


FIGURE 7.1. A comparison of the structure of pointer threading 
(left) and macro/subroutine (M/S) threading (right). In the second 
case word2 is treated as a macro. 
Since both M/S threading and stack threading use the processor's stack to
perform the task of the inner interpreter in hardware, the dictionary entries of 
all words in either system end with a return from subroutine instruction (rts). 
In M/S threading, the rts actually performs a return from a subroutine; in 
stack threading, it pops the address of the next word into the program counter 
and increments the interpreter pointer. Because of the common terminator, it 
is possible to combine the two threading techniques into a single system that 
uses M/S threading for low-level words and stack threading for high-level 
words. 

While creating our Forth compiler, we devised a scheme for such a system, 
but found it practical only for applications in which there is negligible 
intermixing of primitives and secondaries (rarely the case in typical Forth 
programming). Since with a conventional processor most of the time- 
consuming work is done within the primitives, we chose to use only M/S 
threading for our 68000 Forth compiler. 


An Optimizing M/S Forth Compiler 

M/S threading is one of the most effective ways to implement Forth on 
the 68000 MPU, at least in terms of execution speed. Another threading 
method could produce more compact code, but with a processor such as the 
68000, memory usage shouldn't be the highest priority. We have named this 
TIL implementation Comfort, for Compiled Forth. Although all Forth sys- 
tems contain a compiler, this one produces fully compiled object code rather 
than some intermediate code that requires an inner interpreter to execute. In 
addition to using the M/S threading method described above for speed, the 
compiler applies various optimizations to further improve performance. 
Except for providing for the full address space of the 68000 and using 32-bit 
elements in most operations, it conforms to the Forth-83 standard. 


General Register Usage 

Table 7.1 summarizes general register usage. As usual, register a7 acts as 
the hardware stack pointer (the Forth return stack), which holds return 
addresses and provides temporary storage for intermediate results. As will be 
seen later, a7 is not used for the indices of DO loops, as in traditional Forth 
practice. 


        a7          Return stack pointer
        a6          Base pointer
        a5          Data stack pointer
        a4          Index stack pointer
        a0...a3     Scratch
        d7          TOS (top element of data stack)
        d6          Second element of index stack
        d5          Top element of index stack
        d0...d4     Scratch


TABLE 7.1. Register assignments used in Comfort. 
Address register a6, which points to the same location throughout Comfort's
lifetime, serves as the base address of the work space area. Variable
accesses are therefore shorter and faster, and the kernel can be coded in a
position-independent fashion. Since the kernel uses branch and
branch-to-subroutine instructions, and pc- and a6-relative addressing modes,
it can run in any address space and under various operating systems.

Register a5 acts as the data stack pointer. Actually, a5 points to the second 
item on the stack, commonly referred to as the NOS (next on the stack). The 
TOS (top stack element) is kept in data register d7 since it is advantageous to 
keep the main focus of Forth's activities in a register. For this reason, we 
also keep the top two elements of the index stack in registers d5 and d6. 
Address register a4 serves as the index stack pointer, which we will discuss 
later. The remaining registers, d0 through d4 and a0 through a3, can be used
freely as scratch registers by all primitives. 


Examples of Primitives 

Table 7.2, which lists the code for Comfort's most frequently encountered 
primitives, indicates each primitive's size in bytes and execution time in 
machine cycles (excluding the terminating rts instruction). As you can see, 
most of the primitives are only 2 or 4 bytes long—no longer than the 32-bit 
pointer required for pointer threading, and shorter than a jump-to-subroutine 
instruction. 

As an example, consider the "+" routine, which consists of a single 2-byte 
instruction. This instruction adds the 32-bit integer pointed to by a5 (NOS) to 
the contents of d7 (TOS) and increments the stack pointer a5 by 4, effecting a 
pop. Obviously, this primitive should be used as a macro. The execution 
time is only 14 machine cycles, so it would be a waste of space and time if 
an inner interpreter (even just the rts of stack threading) were used to get to 
the next word. 

How does one decide which primitives to implement as macros and which 
as subroutines? A jsr instruction takes 6 bytes on the 68000, so all 
primitives with a length of 6 or fewer bytes (excluding the rts) should be 
defined as macros. Comfort also expands certain time-critical 8-byte and 10- 
byte words in-line, in keeping with its philosophy that favors minimizing 
execution time rather than memory usage. Nevertheless, because of Forth's 
inherent compactness, the resulting object code is still comparable in size to
that produced by a good C compiler.
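
The trade-off behind this rule is simple arithmetic, sketched below in Python. The byte and cycle figures are taken from Table 7.2 and its caption; the function names and the choice of three sample words are ours, and the calculation looks only at the cost paid at each point of use.

```python
# Per-use cost of a primitive, inline versus called (figures from Table 7.2).
JSR_BYTES = 6          # size of a jsr with a 32-bit absolute address
JSR_RTS_CYCLES = 38    # call-and-return overhead quoted in the Table 7.2 caption

PRIMITIVES = {         # word: (body bytes, body cycles)
    "+":               (2, 14),
    "swap":            (6, 29),
    "(long constant)": (8, 25),
}

def cost(word, inline):
    """Return (bytes added at the point of use, cycles per execution)."""
    size, cycles = PRIMITIVES[word]
    if inline:
        return size, cycles
    return JSR_BYTES, cycles + JSR_RTS_CYCLES

def inline_by_size(word):
    """The size rule from the text: inline when the body fits in 6 bytes."""
    return PRIMITIVES[word][0] <= JSR_BYTES

for w in PRIMITIVES:
    print(w, cost(w, True), cost(w, False), inline_by_size(w))
```

For "+", calling costs 6 bytes and 52 cycles per use against 2 bytes and 14 cycles inline. The 8-byte long constant fails the size rule yet still runs in 25 cycles instead of 63, which is the kind of case in which a time-critical word might be expanded anyway.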


M/S Compilation 

The dictionary header has one byte, which we will call the "flag byte," that 
is of special importance to the compiler. Several bit fields in the flag byte are 
dedicated to special tasks (see Figure 7.2). 


Forth Word              68000 Mnemonics             Bytes/Cycles

+                       add.l   (a5)+,d7            2/14
-                       neg.l   d7                  4/20
                        add.l   (a5)+,d7
1+                      addq.l  #1,d7               2/8
1-                      subq.l  #1,d7               2/8
2*                      add.l   d7,d7               2/8
2/                      asr.l   #1,d7               2/10
negate                  neg.l   d7                  2/6
and                     and.l   (a5)+,d7            2/14
or                      or.l    (a5)+,d7            2/14
xor                     move.l  (a5)+,d0            4/20
                        eor.l   d0,d7
not                     not.l   d7                  2/6
dup                     move.l  d7,-(a5)            2/13
drop                    move.l  (a5)+,d7            2/12
over                    move.l  d7,-(a5)            6/29
                        move.l  4(a5),d7
swap                    move.l  (a5)+,d0            6/29
                        move.l  d7,-(a5)
                        move.l  d0,d7
?dup                    tst.l   d7                  6/14 or 25
                        beq.s   $+2
                        move.l  d7,-(a5)
2dup                    move.l  (a5),d0             6/38
                        move.l  d7,-(a5)
                        move.l  d0,-(a5)
2drop                   addq.l  #4,a5               4/20
                        move.l  (a5)+,d7
pick                    lsl.w   #2,d7               6/28
                        move.l  0(a5,d7.w),d7
sp@                     move.l  d7,-(a5)            4/17
                        move.l  a5,d7
@                       movea.l d7,a0               4/16
                        move.l  (a0),d7
c@                      movea.l d7,a0               6/16
                        moveq   #0,d7
                        move.b  (a0),d7
!                       movea.l d7,a0               6/38
                        move.l  (a5)+,(a0)
                        move.l  (a5)+,d7
c!                      movea.l d7,a0               8/37
                        addq.l  #3,a5
                        move.b  (a5)+,(a0)
                        move.l  (a5)+,d7
(short constant,        move.l  d7,-(a5)            4/17
 -128 ... 127)          moveq   #const,d7
(long constant)         move.l  d7,-(a5)            8/25
                        move.l  #const,d7
(address of             move.l  d7,-(a5)            8/25
 variable)              move.l  #addr,d7


TABLE 7.2 Some of the most frequently used primitives. Note that
in the dictionary each of these primitives would be terminated by
an rts instruction. The bytes/cycles values indicate the code size
in bytes and the execution time in machine cycles for the body of
the primitive (without the call and return). Compare these to the
overhead of (6/36) for a bsr-rts or (8/38) for a jsr-rts sequence.
        +----+---------+--------+----+
        | IM |  entry  |  exit  | MC |
        +----+---------+--------+----+

IM: The standard Forth flag for immediate words 
entry: Flags for the word's entry characteristics 
exit: Flags for the word's exit characteristics 
MC: Macro flag 


FIGURE 7.2 Bit fields in the flag byte of a Comfort dictionary word. 


The most significant bit is the immediate flag of the word. As in many 
Forth implementations, it is used within compile mode to determine if a word 
is to be compiled or executed immediately. The least significant bit in the 
flag byte is the macro flag (MC). If it is 0, the word is too long to be 
expanded in-line and thus will be compiled as a subroutine call. The compiler 
checks the distance from the current dictionary pointer, and if it is not more 
than 128 bytes, Comfort generates a 2-byte short branch-to-subroutine bsr. If 
the distance is 32768 or fewer bytes (as it will be most of the time), a 4-byte 
bsr instruction is compiled; otherwise, a 6-byte jump-to-subroutine jsr 
instruction is used. 
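
The selection can be written as a three-way test. The sketch below is ours, not Comfort's code: the function name is invented, and distance is taken to be the number of bytes back to the target word, using the thresholds exactly as stated above.

```python
def call_encoding(distance):
    """Pick the call instruction for a target `distance` bytes away.

    Thresholds follow the text: up to 128 bytes -> 2-byte short bsr,
    up to 32768 bytes -> 4-byte bsr, anything farther -> 6-byte jsr.
    Returns (mnemonic, size in bytes).
    """
    if distance <= 128:
        return ("bsr.s", 2)
    if distance <= 32768:
        return ("bsr", 4)
    return ("jsr", 6)

for d in (40, 128, 129, 32768, 32769):
    print(d, call_encoding(d))
```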

If the MC is 1 (as it will be for the headers of almost all primitives and for 
user-defined secondaries declared macros), the body of the corresponding word 
is ordinarily copied directly, excluding the terminating rts. In some cases, 
however, the compiler can make certain optimizations by modifying the 
primitive code that has been copied. As we will discuss shortly, the entry and 
exit bit fields are used to identify such possibilities. 


Peephole Optimizations 

Many redundant operations take place near the boundaries between Forth 
words. For example, one word may push its result on the stack, while the 
next word immediately pops it off again. Consider the sequence for an indirect 
load of a pointer variable x, expressed in Forth as x @ @. Without 
optimization, this load would be compiled in the following way (refer to 
Table 7.2): 


        move.l  d7,-(a5)        x
        move.l  #adr_x,d7
        movea.l d7,a0           @
        move.l  (a0),d7
        movea.l d7,a0           @
        move.l  (a0),d7


Comfort compiles the much simpler sequence

        move.l  d7,-(a5)
        movea.l #adr_x,a0
        movea.l (a0),a0
        move.l  (a0),d7

because the load of d7, followed by a move from d7 to a0, can be patched in
place by manipulating the destination address field in the last instruction of 
the most recently compiled word. 

The savings are even more significant in the common case that one 
primitive ends with an operation that is annihilated at the start of the next 
one. For example, note what happens if we want to discard the TOS and load 
a 1 as happens in Listing 7.2. Instead of 


        move.l  (a5)+,d7        drop
        move.l  d7,-(a5)        1
        moveq   #1,d7

a very short

        moveq   #1,d7


is sufficient, with a code size reduction from 6 to 2 bytes. If the compiler can 
take advantage of such optimizations, it will not only eliminate the overhead 
between primitives, but also produce "negative overhead" by shortening the 
primitives themselves. Because the compiler has to look only at the near 
neighborhood of the current instruction to perform these optimizations, they 
are called "peephole" optimizations. 
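
The drop-then-literal case can be reproduced with a toy macro expander. This Python sketch works on instruction strings rather than patching object code as Comfort does, and it knows only the one rule shown; the names MACROS, loads_d7 and compile_macros are invented for the illustration.

```python
POP_TOS  = "move.l (a5)+,d7"
PUSH_TOS = "move.l d7,-(a5)"

MACROS = {                       # macro bodies as listed in Table 7.2
    "drop": [POP_TOS],
    "dup":  [PUSH_TOS],
    "1":    [PUSH_TOS, "moveq #1,d7"],
    "+":    ["add.l (a5)+,d7"],
}

def loads_d7(instr):
    """True for an immediate load that overwrites d7 without reading it."""
    return instr.startswith(("moveq #", "move.l #")) and instr.endswith(",d7")

def compile_macros(words):
    code = []
    for w in words:
        body = list(MACROS[w])
        # A pop of TOS followed by a push of TOS leaves memory as it was; if
        # d7 is then overwritten, both instructions are dead and can go.
        if (code and code[-1] == POP_TOS and len(body) > 1
                and body[0] == PUSH_TOS and loads_d7(body[1])):
            code.pop()
            body = body[1:]
        code.extend(body)
    return code

print(compile_macros(["drop", "1"]))      # ['moveq #1,d7']
```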

How can such optimizations be programmed simply and quickly? The 
Comfort system encodes characteristic information about each word in the 
entry and exit fields of the flag byte (see Figure 7.2). After compiling a word, 
the compiler remembers the word's exit field. Before the next word is 
compiled, the entry field of the new word is ORed with the remembered exit 
field, and the result is used directly as an index into a table of 64 short 
branches to the proper optimization routines. (Some of these peephole op- 
timizations can be inferred from Listing 7.3 and Listing 7.5.) 
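
A table lookup of this kind might look as follows. The sketch rests on assumptions that the text does not confirm: only the positions of IM (top bit) and MC (bottom bit) are stated, so the 3-bit widths of the entry and exit fields are inferred from the 64-entry table, and the field codes and handler names are purely hypothetical.

```python
# Assumed flag-byte layout: IM | entry (3 bits) | exit (3 bits) | MC.
ENTRY_MASK = 0b0111_0000
EXIT_MASK  = 0b0000_1110

def no_op(prev_flags, new_flags):
    return "copy body unchanged"

OPTIMIZERS = [no_op] * 64                 # most combinations do nothing

def table_index(saved_exit_flags, new_flags):
    """OR the new word's entry field with the remembered exit field."""
    combined = (new_flags & ENTRY_MASK) | (saved_exit_flags & EXIT_MASK)
    return combined >> 1                  # 6 significant bits -> 0..63

# Hypothetical codes: exit 0b001 = "CPU flags reflect TOS",
# entry 0b010 = "word begins by testing TOS".
def skip_test(prev_flags, new_flags):
    return "skip the leading test of TOS"

OPTIMIZERS[table_index(0b001 << 1, 0b010 << 4)] = skip_test

prev, new = 0b0000_0011, 0b0010_0001      # both words also have MC set
print(OPTIMIZERS[table_index(prev, new)](prev, new))
```

Because the two fields occupy disjoint bits, a single OR and no further decoding is enough to select the handler, which keeps the check cheap enough to run before every compiled word.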

Not all exit/entry combinations are amenable to optimization, so many of 
these branches lead to no-operation routines. But if the exit field indicates, for 
example, that the status register flags (zero, negative) will represent the 
contents of the TOS at run time, and the next entry field indicates a 
conditional (if, while, until), the first instruction of the new word, which has 
to test the contents of the TOS, will be skipped in the copying process. 

Whenever the sequence of expanded macros is interrupted by a call to a 
secondary, optimizations are disabled by resetting the saved exit flag to zero. 
Several flow-of-control keywords (such as begin, else and then) must also 
reset the flag to ensure that the program operates correctly. 

We present one final optimization example. At the end of each secondary 
definition, a check is made to see if the last word is to be compiled into a 
subroutine call. If not, an rts is appended to finish the secondary. Otherwise, 
the last bsr or jsr is changed to a bra or jmp and the saved exit flag is reset. 
This procedure optimizes the return to the calling word by avoiding con- 
secutive rts instructions. 
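
In outline, the end-of-definition step is a single test on the last compiled instruction. The Python sketch below represents code as a list of (mnemonic, operand) pairs, which is our simplification; Comfort patches the opcode in place.

```python
def end_definition(code):
    """Close a secondary; `code` is a list of (mnemonic, operand) pairs."""
    tail_jump = {"bsr": "bra", "jsr": "jmp"}
    code = list(code)
    if code and code[-1][0] in tail_jump:
        op, target = code[-1]
        code[-1] = (tail_jump[op], target)    # the callee's rts returns for us
    else:
        code.append(("rts", None))
    return code

print(end_definition([("moveq", "#1,d7"), ("bsr", "emit")]))
print(end_definition([("bsr", "emit"), ("addq.l", "#1,d7")]))
```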
The Index Stack

In addition to the two stacks found in other Forth implementations, 
Comfort uses an "index stack," whose primary function is to hold the index 
and limit values of do loops, a task traditionally assigned to the return stack. 
Since the do loop is the most convenient and most frequently used looping 
construct in Forth, and since many programs spend a lot of time in the 
innermost loops, a Forth implementation should try to optimize these loops 
as much as possible. 

In an M/S threaded version of Forth, if the return stack is also used for 
index values, such words as DO and +LOOP, which push or pop loop 
parameters and are not implemented as macros, must do a lot more processing 
to preserve the return address. The index stack solves this problem. To allow 
fast access to the index variable i, and for the word LOOP to be implemented 
as a macro, Comfort reserves two data registers, d5 and d6, to hold the index 
and the limit of the innermost loop. (Listing 7.5 provides an example of the 
code for LOOP.) This approach has two additional advantages: 

1) Words manipulating the loop indices (i, j, k, >i, i>, i+, i- . . .) may
also be used outside a colon definition. 

2) The top of the index stack can be used as a fast register variable. 
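
The arrangement can be modeled as a pair of "registers" backed by a small stack. The class below is a sketch, not Comfort's implementation: the order in which the enclosing pair is spilled and restored is an assumption, and do and loop are ordinary methods here rather than compiled code.

```python
class IndexStack:
    """Loop parameters: innermost pair in 'registers', outer pairs in memory."""

    def __init__(self):
        self.d5 = 0          # index of the innermost loop (the word i)
        self.d6 = 0          # limit of the innermost loop
        self.mem = []        # spilled outer pairs (addressed by a4 in Comfort)

    def do(self, limit, start):
        self.mem.append(self.d6)     # spill the enclosing loop's limit ...
        self.mem.append(self.d5)     # ... and its index
        self.d5, self.d6 = start, limit

    def j(self):
        return self.mem[-1]          # index of the next outer loop

    def loop(self):
        """Return True to repeat the loop body, False when the loop ends."""
        self.d5 += 1                 # addq.l #1,d5
        if self.d5 != self.d6:       # cmp.l  d6,d5
            return True
        self.d5 = self.mem.pop()     # restore the enclosing loop's pair
        self.d6 = self.mem.pop()
        return False

ix = IndexStack()
pairs = []
ix.do(2, 0)                              # 2 0 do
while True:
    ix.do(3, 0)                          #   3 0 do
    while True:
        pairs.append((ix.j(), ix.d5))    #     j i
        if not ix.loop():
            break
    if not ix.loop():
        break
print(pairs)
```

Since the return stack is never touched, the same methods work whether or not a subroutine call is in progress, which is what allows the index words to be used outside a colon definition.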


Branch Optimization 

The 68000 provides both long and short branch instructions. The short 
version is preferred if the destination lies within -128 to +127 bytes of the
current program counter, because it needs only 2 bytes as opposed to 4
bytes for the long branch. It is also faster in the case of conditional branches 
that are not taken. 

Many assemblers generate short backward branches, since the distance to 
previously defined labels is known at assembly time. Forward branches are 
not as easily resolved because assembly language does not impose restrictions 
on program structure. For a structured language such as Forth, it is feasible to 
optimize forward as well as backward branches. Comfort optimizes the 
branches in all control structures (except LEAVE; see below). 

The key to forward branch optimization in a single pass lies in the block 
organization of structured languages. A block is a sequence of statements that 
has a single entry point and a single exit point. Blocks may be nested to any
depth. The BEGIN-WHILE-REPEAT sequence in Listing 7.4 is an example of a 
block, which is contained within an IF-THEN block, which in turn is nested in 
a DO-LOOP block. 

Blocks may not overlap, so that when the compiler has to resolve a branch 
enclosing a given block, all branches contained within this block will have 
already been resolved, thus fixing the size of the block. This allows the 
Comfort compiler to proceed as follows: When a branch is first encountered it 
is assumed to be short. Forth functions typically consist of only a few words 
so that the assumption will be correct in most cases. Look, for example, at 
the IF in Listing 7.5. Comfort compiles the hexadecimal code 6700 (beq), 
leaving the branch offset temporarily unresolved, and proceeds to the word I. 
When it reaches the word THEN, it checks the distance to the unresolved
branch and, seeing that it is only 50 hex, inserts this byte into the branch 
opcode, making it 6750. 

Should, however, the distance be greater than can be expressed in one byte, 
the portion of the block that follows the branch must be moved down 2 
bytes, and a 16-bit offset must be inserted into the newly allocated space. 
This block move can be done quite quickly using 32-bit memory-to-memory 
move instructions. 
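
The two cases can be sketched as follows. This is an illustration under assumptions, not Comfort's code: the compiled code is held in a Python bytearray, the function name is invented, and distances too large for a 16-bit displacement are not handled. It relies on the 68000 convention that a zero offset byte in a branch opcode means that a 16-bit displacement follows.

```python
def resolve_forward(code, branch_at):
    """Patch the branch at code[branch_at] to jump to the current end of code.

    The branch was compiled as an opcode byte followed by a zero offset byte
    (for example 0x67 0x00 for beq).
    """
    distance = len(code) - (branch_at + 2)
    if 0 < distance <= 127:
        code[branch_at + 1] = distance        # short form: offset in the opcode
    else:
        # Long form: keep the zero offset byte, open a 2-byte gap after the
        # opcode, and store a 16-bit displacement that allows for the gap.
        code[branch_at + 2:branch_at + 2] = (distance + 2).to_bytes(2, "big")
    return code

short = resolve_forward(bytearray([0x67, 0x00]) + bytearray(0x50), 0)
print(short[:2].hex())                        # 6750, as in Listing 7.5

long_ = resolve_forward(bytearray([0x67, 0x00]) + bytearray(200), 0)
print(long_[:4].hex(), len(long_))            # 670000ca 204
```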

Care must be taken in the case of the IF-ELSE-THEN construct not to 
resolve the IF-ELSE branch as soon as encountering the ELSE, but to wait 
until the THEN is reached. The ELSE-THEN branch should be processed first to 
ensure that the distance from IF to the next instruction after the ELSE will not 
change further. 

A similar caveat holds for the BEGIN-WHILE-REPEAT construct. The 
REPEAT-BEGIN branch is first checked to see if it will be long, before
resolving the WHILE-REPEAT branch. Finally, the actual length of the 
REPEAT-BEGIN branch is calculated and inserted. This method works because 
the REPEAT-BEGIN distance is always longer than the WHILE-REPEAT 
distance, so that a long branch will never be necessary when the first 
determination yields a short branch. 

An exception to the above discussion is the Forth word LEAVE. Since it is 
used to jump out of one or more blocks, the length of the branch cannot be 
predetermined by a simple single-pass compiler of the kind used in Forth. For
this reason, we always use long branches in implementing LEAVE.


Benchmarks 

Listings 7.2 through 7.5 present two simple benchmarks as an illustration 
of the optimization methods described above. Listings 7.3 and 7.5 were 
obtained by compiling the respective source code (Listings 7.2 and 7.4) with 
Comfort and disassembling the resulting object code with a debugger. We 
added the Forth comments by hand and used symbolic names for labels; 
otherwise the mnemonics have not been altered. 

The first example, the recursive definition of the Fibonacci function 


fibo(n) = fibo(n-1) + fibo(n-2) n>1 fibo(1) = fibo(0) = 1 , 


is a good test for the function call overhead. It has great practical value, 
because you can type it in quickly at machines you have only temporary 
access to (e.g., at computer shows or demonstrations), and it can be easily 
translated into any language that allows recursive function definitions. As 
Listing 7.3 shows, this function compiled completely into single-word 
instructions, with a total code size of 40 bytes. The execution time for 


22 fibo . (result: 28657) 
is only 1.8 seconds on our 8 MHz 68000 (with two wait states). Compare
this, for example, to the corresponding definition in MacForth (Creative
Solutions). Although MacForth does not include the word "recurse," the
combination "[ smudge ] fibo [ smudge ]" can be used in its place. The code 
generated for fibo by MacForth is a bit shorter (36 bytes), but needs 8.0 
seconds to execute on the Macintosh (also 8 MHz). 
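
For readers who want to check the benchmark's result in another language, the same recurrence in Python is given below. It is a transcription of the definition above, not of the compiled code, and the helper calls is ours; it counts how many times the function is entered, which is what makes this a test of call overhead.

```python
def fibo(n):
    """Same recurrence as the Forth word: fibo(0) = fibo(1) = 1."""
    return 1 if n < 2 else fibo(n - 1) + fibo(n - 2)

def calls(n):
    """Number of times fibo is entered while computing fibo(n)."""
    return 1 if n < 2 else 1 + calls(n - 1) + calls(n - 2)

print(fibo(22))    # 28657
print(calls(22))   # 57313
```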

The second example, the quasi-standard Sieve of Eratosthenes, which 
calculates the first 1,899 prime numbers, is suitable to test a language's 
memory access and looping capabilities. Comfort's Sieve benchmark execution
time for one iteration is 0.89 seconds on the 8 MHz 68000.
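
The algorithm of the benchmark, transcribed into Python for reference, is shown below. The transcription follows the Forth source word for word as far as the two languages allow; the function and variable names are ours. Index i of the flag array stands for the odd number 2i + 3.

```python
SIZE = 8190

def sieve(size=SIZE):
    flags = bytearray([1]) * size       # flags size 1 fill
    count = 0                           # 0 ( Count )
    for i in range(size):               # size 0 do ... loop
        if flags[i]:                    # flags i + c@ if
            prime = i + i + 3           #   i dup + 3 +
            k = prime + i               #   dup i +
            while k < size:             #   begin dup size < while
                flags[k] = 0            #     0 over flags + c!
                k += prime              #     over +
            count += 1                  #   1+
    return count

print(sieve())                          # 1899
```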


Conclusion 

We have described an optimizing Forth compiler for the Motorola 68000
that uses macro/subroutine threading to reduce the execution-time overhead
associated with the inner interpreter, and that applies special optimizations
to certain Forth word combinations. M/S threading is the method of choice for this and other
powerful processors for which there is a nearly one-to-one correspondence 
between the instructions of the actual and virtual machines, since it allows 
execution of many primitives with no overhead. The limiting case of such a 
processor is the so-called "Forth machine," which executes Forth instructions 
directly via hardware or microcode. Although the Forth machine is, of course, 
an extreme example, the fact remains that for powerful processors, such as the 
68000, one way to faster Forth is through M/S threading. 
Listing 7.1


Hardware Stack Pointer Threading Example


*       a7      Interpreter pointer (IP)
*       a6      Return stack pointer
*       a5      Data stack pointer

Secondary:
        move.l  a7,-(a6)        Save IP on return stack
        lea     8(pc),a7        Point IP to the pointer to ADR_2
        jmp     ADR_1           And jump to the first routine
        dc.l    ADR_2           Address of second routine
        dc.l    ADR_3           Address of third routine
        dc.l    ADR_4           Address of fourth routine
        dc.l    ADR_SEMI        Address of semi

*
* The routine semi performs a return from the secondary and causes
* execution to continue with the next higher level word.
*
semi:
        movea.l (a6)+,a7        Pop old IP from return stack
        rts                     Jump to next routine in deferred list


Listing 7.2 


Source Code for Fibonacci Function 


: fibo
    dup 2 <
    if
        drop
        1
    else
        dup 1- recurse
        swap 2- recurse
        +
    then
;
Listing 7.3


The Fibonacci Function, Disassembled and Hand-Commented


fibo:
        2B07    move.l  d7,-(a5)        dup
        2B07    move.l  d7,-(a5)        2
        7E02    moveq   #2,d7
        201D    move.l  (a5)+,d0        <
        2207    move.l  d7,d1
        2E1D    move.l  (a5)+,d7
        B280    cmp.l   d0,d1           if
        6F04    ble.s   fb1
        7E01    moveq   #1,d7           drop 1
        6012    bra.s   fb2             else
fb1:    2B07    move.l  d7,-(a5)        dup
        5387    subq.l  #1,d7           1-
        61E6    bsr.s   fibo            recurse
        201D    move.l  (a5)+,d0        swap
        2B07    move.l  d7,-(a5)
        2E00    move.l  d0,d7
        5587    subq.l  #2,d7           2-
        61DC    bsr.s   fibo            recurse
        DE9D    add.l   (a5)+,d7        + (then)
fb2:    4E75    rts


Listing 7.4 


Source Code for Sieve of Eratosthenes 


decimal 

8190 constant size 
variable flags 
size allot 


: sieve 
    flags size 1 fill                ( Set array ) 
    0                                ( Count ) 
    size 0 
    do 
        flags i + c@ 
        if 
            i dup + 3 + 
            ( dup . ) 
            dup i + 
            begin 
                dup size < 
            while 
                0 over flags + c! 
                over + 
            repeat 
            drop drop 
            1+ 
        then 
    loop 
    . ." Primes" 
    cr 
; 


Listing 7.5 


The Word SIEVE, Disassembled and Hand-Commented 


sieve:  2B07            move.l   d7,-(a5)        flags 
        2B3C 000018EC   move.l   #flags,-(a5) 
        2B3C 00001FFE   move.l   #size,-(a5)     size 
        7E01            moveq    #1,d7           1 
        6100 D94A       bsr      $fill           fill 
        2B07            move.l   d7,-(a5)        0 
        7E00            moveq    #0,d7 
        2B07            move.l   d7,-(a5)        size 
        2B3C 00001FFE   move.l   #size,-(a5) 
        7E00            moveq    #0,d7           0 
        6100 DD88       bsr      $do             do 
sv1:    2B07            move.l   d7,-(a5)        flags 
        2B3C 000018EC   move.l   #flags,-(a5) 
        2E05            move.l   d5,d7           i 
        DE9D            add.l    (a5)+,d7        + 
        2047            movea.l  d7,a0           c@ 
        7E00            moveq    #0,d7 
        1E10            move.b   (a0),d7 
        4CDD 0080       movem.l  (a5)+,d7        if 
        6750            beq.s    sv4 
        2B07            move.l   d7,-(a5)        i 
        2E05            move.l   d5,d7 
        2B07            move.l   d7,-(a5)        dup 
        DE9D            add.l    (a5)+,d7        + 
        2B07            move.l   d7,-(a5)        3 
        7E03            moveq    #3,d7 
        DE9D            add.l    (a5)+,d7        + 
        2B07            move.l   d7,-(a5)        dup 
        2B07            move.l   d7,-(a5)        i 
        2E05            move.l   d5,d7 
        DE9D            add.l    (a5)+,d7        + 
sv2:    2B07            move.l   d7,-(a5)        (begin) dup 
        2B07            move.l   d7,-(a5)        size 
        2E3C 00001FFE   move.l   #size,d7 
        201D            move.l   (a5)+,d0        < 
        2207            move.l   d7,d1 
        2E1D            move.l   (a5)+,d7 
        B280            cmp.l    d0,d1           while 
        6F20            ble.s    sv3 
        2B07            move.l   d7,-(a5)        0 
        7E00            moveq    #0,d7 
        2B07            move.l   d7,-(a5)        over 
        2B2D 0004       move.l   4(a5),-(a5)     flags 
        2E3C 000018EC   move.l   #flags,d7 
        DE9D            add.l    (a5)+,d7        + 
        2047            movea.l  d7,a0           c! 
        568D            addq.l   #3,a5 
        109D            move.b   (a5)+,(a0) 
        2E2D 0004       move.l   4(a5),d7        over 
        DE9D            add.l    (a5)+,d7        + 
        60CC            bra.s    sv2             repeat 
sv3:    2E1D            move.l   (a5)+,d7        drop 
        2E1D            move.l   (a5)+,d7        drop 
        5287            addq.l   #1,d7           1+ 
sv4:    5285            addq.l   #1,d5           (then) loop 
        BA86            cmp.l    d6,d5 
        6692            bne.s    sv1 
        2A1C            move.l   (a4)+,d5 
        2C1C            move.l   (a4)+,d6 
        6100 D820       bsr      $dot            . 
        6100 D498       bsr      $dotquote       ." Primes " 
        0650 7269 
        6D65 7320 
        6000 D78E       bra      $cr             cr 


8 


A Forth Native-Code Cross-Compiler 


Raymond Buvel 


Any computer that runs Forth can create executable 68000 
programs with this cross-compiler. If you have an 8-bit 
microcomputer system and want to experiment with the 
Motorola MC68000, this chapter will interest you. 


One of the problems you may face in using a new microprocessor is how 
to develop programs for it. A solution, albeit an expensive one, is to 
purchase a new computer system and a new set of software for each of the 
processors that interest you. Another solution is to use an existing computer 
and its software to develop programs for the new processor. Compilers that 
use an existing computer to produce code for another machine are called 
cross-compilers. 

The cross-compiler presented here (Listing 8.1) is a native-code compiler 
for the MC68000. It loads on top of a host Forth development system (which 
doesn't have to be running on a 68000) and compiles a subset of the Forth 
language. The compiler itself is written in Forth and I have isolated the 
necessary system-dependent parts to a few clearly identified words. The code 
presented here can be easily ported to most Forth environments. 

The compiler produces two types of compiled definitions: macros and 
subroutines. The macro definitions are essentially extensions to the compiler 
itself and do not directly produce an executable program. Subroutine def- 
initions generate the executable code, which can consist of macro, variable, 
constant and subroutine references. The executable code is stand-alone and can 
be placed in ROM. The destination for the executable code is left entirely up 
to the user and can be changed by altering the definition of a single word in 
the compiler. The compiler does not contain any built-in words for doing I/O; 
this is left up to the user. 




If you are not familiar with Forth, you will probably want to get a copy of 
Starting Forth by Leo Brodie (Prentice-Hall, 1981). I used this book as a ref- 
erence when I developed the compiler, so the Forth words implemented here 
work as described in the book. I have presented the detailed description of the 
operators supported by the compiler in assembly language. Although you 
need a familiarity with the MC68000 assembly language to understand the 
design of the compiler, it is not necessary to know assembly language to use 
it. 


Memory Layout 

Before considering the compiler, you should know how it uses the 
MC68000 address space. The compiler uses four separate areas of MC68000 
address space. The pointers to these areas all are maintained in MC68000 
registers, so the user can allocate any area of memory for the various func- 
tions by properly setting up the registers. Thus, you need not recompile the 
program to change the memory map. These areas are described below. 

* Code Pool. Subroutine definitions place their output code in the code 
pool and update the code pool pointer variable in the compiler (M68PCODE 
in Listing 8.1). During execution of the resulting code, the MC68000 
program counter is the pointer to this area of memory. Only relative address- 
ing is used, so you can relocate the code pool simply by moving the code and 
starting the program at the proper place. 

* Variable Pool. The memory used by variables and arrays is allocated 
relative to the variable pool pointer (a5 in the MC68000). The compiler word 
M68ALLOT is used to allocate space and maintain even address alignment. 
To avoid address faults, the value placed in a5 must be even. With that 
restriction, you may place the variable pool anywhere in memory by setting 
the value of a5. With an appropriate supervisor program, you can produce 
reentrant modules with this compiler by using a5 to assign a separate space 
for the local variables each time the module is called. 

* Data Stack. MC68000 register a6 points to the memory used by the 
data stack. The stack is maintained using the auto decrement addressing mode 
to store information on the stack and the auto increment addressing mode to 
remove information from the stack. For further information on the workings 
of the data stack, see the stack operator section of Listing 8.2. 

* Return Stack. The hardware stack is used for the return stack because 
most of the return stack operations then become automatic. a7 is the pointer 
to the return stack. Since there is both a supervisor and a user hardware stack 
pointer, you can use modules generated by this compiler for both interrupt 
service routines and user programs. 
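
The even-alignment rule for the variable pool can be sketched in a few lines of Python. This is a model written for illustration, not the source of M68ALLOT: the class name VariablePool and the overflow check are assumptions (the compiler itself does not report a pool that grows past 32K bytes), and the test values are the ones shown in Figure 8.3.

```python
class VariablePool:
    """Hypothetical model of the variable pool pointer (M68PVAR)."""

    LIMIT = 0x8000  # offsets are 16-bit signed, so the pool stops at 32K bytes

    def __init__(self, start=0):
        if start % 2:
            raise ValueError("pool pointer must be even")
        self.pointer = start

    def allot(self, nbytes):
        """Reserve nbytes, rounded up to an even count; return the old pointer."""
        base = self.pointer
        self.pointer += nbytes + (nbytes & 1)
        if self.pointer > self.LIMIT:
            # Assumption: the real compiler would silently emit bad code here.
            raise OverflowError("variable pool longer than 32K bytes")
        return base


pool = VariablePool(0x20)
base = pool.allot(5)  # a five-element byte array, as in Figure 8.3
print(f"base={base:04X} pointer={pool.pointer:04X}")
```

Rounding every request up to an even byte count keeps the pointer even, so any 16- or 32-bit variable allotted afterward starts on a word boundary.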


Compiler Description 

The Forth subset that I chose to implement is for a particular hardware 
configuration, but you could expand it if you require different I/O. I assume 
that you have a computer with a Forth system running on it. In the following 
discussion, I will refer to this computer as the host. I also assume that you 
are using the MC68000 as a coprocessor. This assumption greatly reduces the 
complexity of the subset to be implemented. All functions that interact with 
the terminal can be left out, and the host can handle all the interaction with 
the operating system and peripherals. 

A simple bidirectional communication channel is all that is required 
between the host and the MC68000. I have chosen to implement the arith- 
metic, stack, memory access and control operators found in most Forth 
systems. You can write I/O routines for data transfer between the MC68000 
and the host using the primitives provided, since the MC68000 uses memory- 
mapped I/O. 

The compiler generates machine code, so there is no need for an inner 
interpreter. Because the MC68000 provides the capability of writing position- 
independent code, all of the code produced by this compiler is position- 
independent unless the user explicitly forces it to be otherwise. Because the 
code is kept separate from the variable and stack space, the output from the 
compiler can be put into ROM. It is sometimes desirable to use programs 
generated with this compiler in host environments other than Forth. 
Therefore, I have provided a simple output scheme that allows you to send the 
output code to any device supported by the host. 

To avoid conflict with the Forth definitions in the host, most of the 
compiler is in a separate vocabulary (named M68K) that is accessed only 
through the defining words. All definitions created with the compiler also are 
placed in this vocabulary to prevent accidental reference to them while using 
the host development system. I have given some of the words normally used 
in Forth different names to avoid conflicts with the host definitions; I will 
discuss these differences later. 

The compiler produces two basic types of definitions: macros and 
subroutines. Macro definitions do not generate any output code, but when 
referenced they store the code that implements the macro in the definition 
currently being compiled. Subroutine definitions generate output code when 
defined and generate a subroutine call when referenced in another subroutine 
definition. A macro definition may be referenced in another macro or in a 
subroutine definition, but a subroutine may not be referenced in a macro 
definition. The macro definition is the basic building block in the compiler, 
so I will discuss it in detail before considering constant, variable and array 
definitions. 


Macros 

You create a macro in the same way that you do a Forth colon definition, 
except that you use :M68MAC and ;M68MAC in place of the : and ; of a 
Forth definition. The body of the definition consists of executable MC68000 
machine code or references to macros, variables, constants and arrays. For 
examples of macro definitions, see Listing 8.1 and the examples below. 

Defining a macro activates the M68K vocabulary, creates a Forth header in 
the dictionary of the host, and reserves space for the code length. The host 
Forth is in execution mode, so any words referenced in the body of a macro 
definition are executed immediately. Terminating the macro definition stores 
the length of the code segment and reactivates the Forth vocabulary. Any 
subsequent reference to the macro copies the code contained within the macro 
body into the host dictionary at the location HERE, then the dictionary 
pointer is updated to point to the memory location following the code. 

To illustrate this process, Figure 8.1 shows the definition of the macro 2*. 
First :M68MAC is used to start the definition, create the Forth header for 2* 
(I am being deliberately vague about the form of the header because that 
depends on the particular implementation of Forth being used), and allocate 
space for the code length. Next the macro DUP is called; it copies the code for 
performing a DUP function into the host dictionary at the location HERE and 
updates the dictionary pointer by 2. Then the macro + is called; it copies its 
code into the dictionary and updates the dictionary pointer by 4. Finally 
;M68MAC is used to terminate the definition and compute and store the 
macro length in the 2 bytes following the header. In this case, the length is 6 
bytes. 
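
The copying behavior just described can be modeled in Python. This is a sketch under stated assumptions: the dictionary is reduced to a plain Python dict, define_macro is a hypothetical helper name, and the machine-code bytes are taken from the output shown in Figure 8.4, split into 2 bytes for DUP and 4 bytes for + to agree with the lengths given above.

```python
# Machine code bytes as they appear in Figure 8.4 (assumed split: DUP, then +).
DUP = bytes.fromhex("3D16")        # move.w (a6),-(a6)
PLUS = bytes.fromhex("301ED156")   # move.w (a6)+,d0 / add.w d0,(a6)

macros = {"DUP": DUP, "+": PLUS}


def define_macro(name, body_words):
    """Model :M68MAC ... ;M68MAC: each referenced macro copies its code."""
    code = bytearray()
    for word in body_words:
        code += macros[word]       # a reference copies the macro's code
    macros[name] = bytes(code)     # ;M68MAC stores the finished segment
    return len(code)               # the value kept in the length field


length = define_macro("2*", ["DUP", "+"])
print(length, macros["2*"].hex().upper())
```

Because a reference copies code instead of calling it, a macro costs no call overhead at run time but adds its full length to every definition that uses it.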


Constants and Variables 

Single- and double-precision constants are compiled as macros containing a 
single MC68000 instruction to push the value of the constant onto the data 
stack. For example, the word M68CON is used to define a single-precision 
constant in Figure 8.2. The value of the constant is taken from the host stack 
and stored in the macro as part of a move-immediate instruction (see Listing 
8.2). The word M68DCON is the same, except that it involves a double- 
precision value. 
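
As a rough illustration of what a single-precision constant contains, the Python sketch below builds the four bytes of a move-immediate instruction. It is an assumption-laden model: the function names are invented here, and the opcode word 3D3C was worked out by hand for MOVE.W #n,-(A6) (the instruction shown in Figure 8.2), not copied from Listing 8.2. The high byte is emitted first, as the MC68000 requires.

```python
MOVE_W_IMM_PUSH = bytes.fromhex("3D3C")  # MOVE.W #imm,-(A6), hand-assembled


def high_byte(n):
    """Model of HIGH-BYTE: bits 8..15 of a 16-bit value."""
    return (n >> 8) & 0xFF


def con(n):
    """Model of $CON: emit a 16-bit value, high byte first."""
    return bytes([high_byte(n), n & 0xFF])


def m68con(n):
    """Code segment of a constant macro that pushes n onto the data stack."""
    return MOVE_W_IMM_PUSH + con(n)


print(m68con(5).hex().upper())
```

A number compiled with LITERAL would produce the same four bytes; only the way the value reaches the compiler differs.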

Variables are defined as single-precision constants that push the variable 
pool relative address onto the stack for use with the fetch and store operations. 
Note that the variable pool relative address is a 16-bit signed integer, so the 
variable pool can be no longer than 32K bytes. The word M68VAR defines a 
single-precision variable; the word M68DVAR is for double precision. 

Arrays are defined as macros that take the index off the stack, compute the 
variable pool relative address of that element, then leave the result on the 
stack. The compiler supports arrays whose elements are either byte, single 
precision or double precision. The words M68CARY, M68ARY and 
M68DARY, respectively, define these data types. Figure 8.3 shows the def- 
inition of a byte array containing five elements. The variable pool pointer is 
shown before and after the definition of the array. Note that the compiler 
maintains alignment on word boundaries to avoid address exceptions when the 
code is executed. 


:M68MAC 2* DUP + ;M68MAC 

    Dictionary header for the macro 2* 
    Length of macro code segment 
    MC68000 machine code for DUP 
    MC68000 machine code for + 

FIGURE 8.1 A macro definition and the resulting dictionary entry. 


5 M68CON #5    (Defines the constant #5) 

    Dictionary header for the macro #5 
    Length of macro code segment 
    MOVE.W #5,-(A6)    Pushes the value 5 onto data stack 

FIGURE 8.2 A constant definition and the resulting dictionary entry. 


5 M68CARY EXAMP    (Defines the byte array EXAMP) 

    Variable pool pointer: 
        Before = 0020 
        After  = 0026 

    Dictionary header for the macro EXAMP 
    Length of macro code segment 
    Code to add the variable pool relative 
    address of the array (20 hex) to the 
    index value on the data stack 

FIGURE 8.3 An array definition, the resulting dictionary entry and 
the effect on the variable pool pointer (note all values are in hex). 


Subroutines 

Defining a subroutine activates the M68K vocabulary and creates a Forth 
header in the dictionary of the host. The code pool relative address of the 
subroutine is then stored as the first entry of the definition. Compilation 
proceeds in the same way as in a macro definition, except that references to 
subroutine definitions are also allowed. When the definition is terminated, a 
return from subroutine instruction is compiled, the code pool pointer 
M68PCODE is updated, and the code is sent to the output file and deleted 
from the dictionary. This process leaves only the header and the code pool 
relative address of the subroutine in the dictionary. 

Subsequent reference to the subroutine uses the code pool relative address 
to compute the relative address required in a branch to the subroutine instruc- 
tion. Note that the branch instructions on the MC68000 restrict your program 
to 32K bytes because all of the subroutine calls are back branches; forward 
referencing is not supported in this compiler. 

Since the code pool relative address of a subroutine is stored at the start of 
the definition, you may reference a subroutine recursively. You must exercise 
care when doing this, however. Subroutine calls do not create local variables, 
so a subroutine that stores a value in a variable may not operate properly 
when called recursively. I recommend keeping all variables on the stack in 
recursive subroutines. When you do this, make sure the data stack is large 
enough; because there is no check for stack overflow, something will be 
clobbered if the data stack space is too small. 

Figure 8.4 illustrates the process of subroutine compilation. The word 
:M68K is used to start the definition and create the Forth header for 4*; this 
also sets the code pool relative address of 4* (in this case, it is zero). Next the 
macro 2* is called twice (this macro 2* is the one defined in Figure 8.1, not 
the one actually implemented in the compiler, which is more efficient). Each 
time 2* is called, the code implementing it is stored in the host dictionary, 
and the dictionary pointer is updated. Then ;M68K is used to terminate the 
definition by compiling a subroutine return instruction, adding the length of 
the subroutine to the code pool pointer, copying the code to the output file, 
and deleting the code from the dictionary. 
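
The steps just described, together with the way a later reference becomes a relative branch, can be modeled in Python. This is a sketch and not the compiler's own code: the class CodePool, its method names and the second subroutine 16* are hypothetical, the bytes for 2* are those of the macro in Figure 8.1, and the displacement is measured from the word that follows the BSR opcode, which is how the MC68000 computes it.

```python
TWO_STAR = bytes.fromhex("3D16301ED156")  # the 6-byte macro 2* of Figure 8.1
RTS = bytes.fromhex("4E75")


class CodePool:
    """Hypothetical model of subroutine compilation into the code pool."""

    def __init__(self):
        self.pcode = 0             # plays the role of M68PCODE
        self.output = bytearray()  # stands in for the output file
        self.entries = {}          # name -> code pool relative address

    @staticmethod
    def bsr(target, call_site):
        """Encode BSR with a 16-bit displacement relative to call_site + 2."""
        disp = target - (call_site + 2)
        if not -0x8000 <= disp <= 0x7FFF:
            raise ValueError("subroutine out of branch range")
        return bytes([0x61, 0x00]) + (disp & 0xFFFF).to_bytes(2, "big")

    def define(self, name, body):
        """body items are macro code (bytes) or names of earlier subroutines."""
        entry = self.pcode
        self.entries[name] = entry         # stored first, so recursion works
        code = bytearray()
        for item in body:
            if isinstance(item, bytes):
                code += item               # macro reference: copy the code
            else:
                code += self.bsr(self.entries[item], entry + len(code))
        code += RTS                        # the terminating word compiles RTS
        self.output += code                # code goes to the output file
        self.pcode += len(code)            # and the code pool pointer advances
        return entry


pool = CodePool()
pool.define("4*", [TWO_STAR, TWO_STAR])
pool.define("16*", ["4*", "4*"])           # hypothetical caller of 4*
print(f"{pool.pcode:04X}", pool.output.hex().upper())
```

Every call reaches backward to code that already exists, which is why the displacements in this model are always negative.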


:M68K 4* 2* 2* ;M68K 

    Dictionary header for the subroutine 4* 
    Relative address of subroutine 

Code pool pointer: 
    Before = 0000 
    After  = 000E 

Code sent to the output file: 

    2*                   2*                   RTS instruction 
    3D 16 30 1E D1 56    3D 16 30 1E D1 56    4E 75 

FIGURE 8.4 A subroutine definition, the resulting dictionary entry 
and the output code. 


Installation 

To install the compiler, you need a Forth-83 system. Listing 8.1 contains 
the complete source for the compiler; you must somehow get this into your 
Forth system. 

You must customize a few words in the compiler to your system. A word 
called HIGH-BYTE takes the top entry off the stack and returns the high byte. 
You must replace this definition with the code to accomplish this task; it 
may be necessary to use a code definition on your system. The word 
M68OUT in the listing can be designed to send the generated code to whatever 
output device or file you want (refer to the note in the listing). The word 
M68OUT currently prints the code on the screen. If you want to use the 
external reference capability, you will need to modify the definition of 
EXTERNAL to send the output where you want it. If you do not want that 
feature, simply delete the definition entirely. 

The compiler is loaded in three sections. The basic compiler and error 
checking routines constitute the first section. They are followed by the pro- 
gram control and looping operations with their associated error checking. The 
last section contains the macros that implement the operators supported by 
the compiler. 


Using the Compiler 

Before explaining how to use the compiler, I want to point out some of 
the difficulties you will encounter. Although this compiler uses a subset of 
Forth, you will find that you must modify most Forth programs before the 
compiler will accept them. Obviously, you must remove and/or replace 
constructions that are not supported by the compiler, but many programming 
techniques used in Forth also will not work because the output code is 
nothing like the indirect threaded code used in most Forth implementations. 
Therefore, such practices as modifying the values of constants on the fly will 
not work, nor can you compile things into the dictionary at runtime since 
there is no dictionary. 

The compiler runs as a collection of words in the host Forth system, but I 
have not rewritten the Forth word NUMBER to be sensitive to the state of 
the M68K compiler. Therefore, a number contained within a definition will 
not be compiled into the definition automatically. Instead it goes on the host 
stack and must be compiled into the definition with the word LITERAL (or 
DLITERAL in the case of double precision). Also note that you must use at 
least one subroutine definition to get the code to the output file. This should 
all become clear as I go through the example in Listing 8.3. 

Listing 8.3, screen 8, shows the Forth implementation of a benchmark 
program (Jim Gilbreath, Byte, September 1981, page 190); I will use this as 
a reference to show how the Forth code must be modified to compile prop- 
erly. Screen 9 shows the same program for the M68K compiler. The first 
thing to notice is the difference in the definitions of the constants. Zero is 
used several times in the program. To avoid using LITERAL so many times, 
I defined a constant #0 in the M68K vocabulary and used that wherever a zero 
was used. (I could have chosen the name 0 for this constant, since it is 
redefined only when the M68K vocabulary is active, but that would have made 
the example more confusing.) 

The constant SIZE is used in two ways in the program. SIZE is used as a 
Forth constant when allocating array space and as an M68K constant within 
the definition of DO-PRIME. It is defined twice, once in the Forth vocabulary 
using CONSTANT and once in the M68K vocabulary using M68CON. The 
definition of the variable FLAGS is a little different from screen 8: the M68K 
compiler follows the convention in Starting Forth and does not provide 
initialized variables. Note that the constants 1 and 3 used inside DO-PRIME 
are compiled using LITERAL, as all numbers inside a definition must be if 
they are not defined as constants. The resulting code for either a constant or 
literal is identical; only the compilation process is different. I used 
:M68MAC to define DO-PRIME because I wanted the entire program to be a 
single subroutine. 

Screen 10 contains an initialization routine required to set the pointers used 
by the code; it also contains a subroutine TEST, which causes the output 
code to be generated. TEST also causes the benchmark to be iterated 10 times 
to conform to the requirements of Gilbreath's test. Note that because the 
prime count cannot be printed as in the Forth version, it is simply dropped 
from the stack. You might think that you could eliminate the prime count 
from the program itself, but that would give a false representation of the 
execution time of the benchmark. 

Screen 11 is the same program as screen 9, rewritten to use the array 
features of the compiler and to remove a couple of the inefficiencies built into 
the original program. Screen 12, which is identical to screen 10, is here 
simply to compile screen 11 without using an indirect LOAD. Listing 4 is an 
assembly language version of the benchmark program used for timing and 
code size comparisons. Table 8.1 shows the results of the benchmark. Note 
that we pay a fairly heavy penalty for using a stack-oriented language. The 
programming convenience is worth it in most cases, however. 
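
To put a number on that penalty, the short Python sketch below divides each row of Table 8.1 by the hand-written assembler row. The figures are copied from the table; the dictionary layout and variable names are just a convenience chosen here. Keep in mind that the Screen 8 row was measured on a different machine (a Z80), so its ratio mixes processor and compiler effects.

```python
# (code size in bytes, time in seconds for 10 iterations), from Table 8.1
results = {
    "Screen 8": (109, 85.0),
    "Screen 9": (220, 8.7),
    "Screen 11": (216, 8.1),
    "Assembler": (74, 2.1),
}

asm_size, asm_time = results["Assembler"]

ratios = {
    name: (size / asm_size, secs / asm_time)
    for name, (size, secs) in results.items()
}

for name, (size_ratio, time_ratio) in ratios.items():
    print(f"{name:10s} {size_ratio:4.1f}x size  {time_ratio:5.1f}x time")
```

On these figures the compiled Forth of screen 11 is roughly three times larger and nearly four times slower than the assembly language version running on the same hardware.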

Source       Code Size    Time (10 iter.)    Comments 

Screen 8     109 bytes    85 sec.            Z80 Forth @ 3.5 MHz 
Screen 9     220 bytes    8.7 sec.           68000 @ 10 MHz with 2 wait states 
Screen 11    216 bytes    8.1 sec.           68000 @ 10 MHz with 2 wait states 
Assembler    74 bytes     2.1 sec.           68000 @ 10 MHz with 2 wait states 

TABLE 8.1 Results of benchmark tests shown in Listing 8.3. 


At the end of the chapter is a description of the words that perform the 
compilation operations. The list is organized functionally rather than 
alphabetically. The word described is listed to the left, followed by the host 
stack image; to the right is an example of the proper usage of the word. 
Listing 8.2 contains the assembly language source for the Forth words 
supported by this compiler. That listing also includes a short description of 
the supported words. Refer to Listing 8.2 and Starting Forth to clear up any 
confusion concerning these definitions. Note that Listing 8.2 also describes 
several operators that are not standard Forth. Operators for doing absolute 
memory references are described in the memory and I/O section. Absolute 
subroutine calls and jumps are described in the control operations section. 


Debugging Your Programs 

There is no easy way to debug programs written for this compiler, so I 
recommend the following development procedure: 

1. Write and debug the program in Forth. Your Forth development system 
provides a good environment for debugging programs that you develop for 
this compiler. It is very important at this stage to avoid using any operations 
not supported by the compiler, since you will just have to remove them later. 
You should avoid using embedded literals in your definitions—they are rather 
messy to take care of later. 

2. Translate the program into a form acceptable to the compiler. If you 
have avoided using embedded literals in your definitions, the only changes 
should be replacing the : and ; as appropriate. You can take care of the em- 
bedded literals by defining each one as a constant or using the word LITERAL 
(or DLITERAL). 

3. Compile the program and load it into your MC68000 computer. Make 
sure that registers a5, a6 and a7 are set properly (either by a supervisor 
program, by hand, or by using the load instructions provided in the compiler). 

The program should now operate properly. If not, step 2 is the most likely 
place to find the errors. I have frequently encountered errors in resolving 
program control structures, errors that occur because an embedded literal has 
not been compiled. This type of error is picked up in the compile phase, so 
incorrect code is not produced. 


Final Comments 

This compiler could form the basis for a modular programming envi- 
ronment for the MC68000 microprocessor. The modifications necessary 
would not be very complicated. To make the compiler generate modular code, 
you would have to add a new word to the basic compiler. This word should 
reset all of the compiler variables to their original state and generate an 
appropriate header to permit the operating system to load and execute the 
module. Another word to terminate a module and check for errors would be 
required; the error checking code in the existing compiler can be used as a 
guide. Note that the absolute addressing operators must be used for all global 
variables. I recommend that global variables be avoided and that all parameters 
passed from one module to another be passed on the data stack. 

I would like to see floating-point arithmetic added to the compiler, but I 
am unlikely to do it anytime soon. I would also like to have an MC68000 
assembler built into the compiler, so machine code does not have to be typed 
in hex. The Forth assembler developed by Michael A. Perry might be adapted 
for this purpose. [See Chapter 9, "A 68000 Forth Assembler" by Michael A. 
Perry.] 


Description of Compiling Words 


:M68K (--) :M68K xxxx 

Creates a header for the subroutine word xxxx in the M68K vocabulary and 
sets the variables M68ENTRY and M68PFA. Reference to xxxx within a 
subroutine definition generates a branch to subroutine using the PC-relative 
addressing mode. The word xxxx may only be referenced within a subroutine 
definition. Any other usage will produce an error message. So long as no side 
effects occur, the word xxxx may be referenced recursively. Note that the code 
may be put into ROM because the stack and variable space are kept separate 
from the code. 


;M68K (--) 

Terminates the construction of a subroutine definition and sends the code 
to the output file. The code for the subroutine is deleted from the host 
dictionary after it is written out, and only the code pool relative address of the 
subroutine is retained. 


:M68MAC (--) :M68MAC xxxx 

Creates a header for the macro word xxxx in the M68K vocabulary and sets 
the compiler variable M68PFA. Reference to xxxx within a definition copies 
the compiled code into the host dictionary. A macro word may be referenced 
within the definition of another macro word or within the definition of a 
subroutine word. 


;M68MAC (--) 
Terminates the construction of a macro type word, encloses the code and 
updates the compiler variables. 


M68CON (n--) n M68CON xxxx 

Defines a macro word xxxx that pushes the value n onto the stack when 
xxxx is executed. The value n must be on the host stack when M68CON is 
referenced. 


M68DCON (d--) d M68DCON xxxx 
Defines a double-precision constant in the same way as M68CON. 


M68ALLOT (n--) n M68ALLOT 

Allocates n bytes of space in the variable pool by updating the variable 
pool pointer variable M68PVAR. Note that this word maintains even-byte 
alignment so that address exceptions will not occur on 16- and 32-bit memory 
references. Also note that it is an error if the variable pool becomes longer 
than 32K bytes, but the compiler will not report this as an error—incorrect 
code would be generated. 


M68VAR (--) M68VAR xxxx 

Defines a single-precision variable by using the pointer M68PVAR as the 
parameter n for M68CON. The pointer M68PVAR is then updated by 2 bytes 
using M68ALLOT. Execution of the word xxxx leaves the variable pool 
relative address of the variable on the top of the stack. 


M68DVAR (--) M68DVAR xxxx 
Defines a double-precision variable in the same way as M68VAR except 
that the pointer M68PVAR is updated by 4 bytes. 


M68ARY (n--) n M68ARY xxxx 

Defines a single-precision array xxxx that is n elements long. The word 
xxxx is defined as a macro that takes an element number off the top of the 
stack and leaves the variable pool relative address of that element. 


M68CARY (n--) n M68CARY xxxx 
Defines a byte array in the same manner as M68ARY. 


M68DARY (n--) n M68DARY xxxx 
Defines a double-precision array in the same manner as M68ARY. 
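
The address arithmetic performed by the three array types can be sketched in Python as follows. Two assumptions are made here and are not confirmed by the text above: that element numbers start at zero, and that the index is scaled by the element size (1, 2 or 4 bytes, matching the sizes used by the byte, single-precision and double-precision definitions). The function name element_address is invented for the example.

```python
# Element sizes in bytes; the 2 and 4 match what M68VAR and M68DVAR allot.
ELEMENT_SIZE = {"M68CARY": 1, "M68ARY": 2, "M68DARY": 4}


def element_address(kind, base, index):
    """Variable pool relative address of element `index` (assumed zero-based)."""
    return base + index * ELEMENT_SIZE[kind]


# The byte array EXAMP of Figure 8.3 starts at offset 20 hex.
for kind in ("M68CARY", "M68ARY", "M68DARY"):
    print(kind, f"{element_address(kind, 0x20, 3):04X}")
```

The result is still relative to the variable pool, so the fetch and store operators add the contents of a5 when they use it.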


EXTERNAL (--) EXTERNAL xxxx 

Creates an external reference xxxx that contains the code pool relative 
address of the last subroutine word that was defined. The user can customize 
this word to produce a file containing an external reference list. Currently, 
this word creates a constant xxxx in the Forth vocabulary of the host. 


Listing 8.1 


( M68K Cross compiler -- Copyright Notice ) 
FORTH based cross compiler for the Motorola 68000 microprocessor 


Copyright 1983 by Raymond L. Buvel 
319 Palos Verdes Blvd. #105 
Redondo Beach, CA 90277 


All rights reserved except as stated below. 


This compiler may be distributed to anyone provided this copyright notice 
is included and the distribution is not for profit. Contact me concerning 
royalties for commercial distribution. There is no royalty on code 
produced with this compiler provided the compiler itself is not SOLD as an 
integral part of a software package. 


( M68000 Compiler modifications ) 


6 Jul 86 -- Compiler updated so that it will compile on the 
public domain F83 FORTH development system. These 
modifications should work for FORTH 83 standard 
systems (with appropriate modifications for 
nonstandard words mentioned in the code.) 


( M68000 Compiler load module ) 

CR .( Loading compiling words) CR 

8 LOAD DECIMAL ( Load the compiling words ) 
CR .( Loading control structures) CR 

25 LOAD DECIMAL ( Load the control structures ) 
CR .( Loading macro definitions) CR 

34 LOAD DECIMAL ( Load the macro definitions ) 
FORTH DEFINITIONS 


( M68K Cross Compiler -- Vocabulary definition ) 
VOCABULARY M68K IMMEDIATE 

M68K DEFINITIONS 

HEX 

--> 

Note.. the compilation words listed below are contained in 
the FORTH vocabulary and cause entries to be made in the 
M68K vocabulary. 


:M68K :M68MAC M68VAR M68DVAR M68CON M68DCON 
M68ARY M68DARY M68CARY 


( M68K Cross Compiler -- Variable definitions ) 

M68K DEFINITIONS 

( Code pointer in M68000 -- note relative addressing ! ) 
VARIABLE M68PCODE 0 M68PCODE ! 

( Variable pool pointer in M68000 -- relative to A5 ) 
VARIABLE M68PVAR 0 M68PVAR ! 

( Entry point of the subroutine being defined ) 

VARIABLE M68ENTRY 

( Parameter field address [ in HOST ] of word being defined ) 
VARIABLE M68PFA 

--> 
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( M68K Cross Compiler -- Variable definitions ) 

M68K DEFINITIONS 

( Error checking variables ) 

VARIABLE M68?MAC ( True if in a MACRO definition ) 
VARIABLE M68K? ( True if in a SUBROUTINE definition ) 
VARIABLE M68?PAIRS ( Count of incomplete branching ops. ) 
0 M68?MAC ! 0 M68K? ! 0 M68?PAIRS ! 

--> 


( M68K Cross Compiler -- Error checking ) 
M68K DEFINITIONS 
: ?M68PAIRS ( Check for unbalanced control structures ) 
M68?PAIRS @ IF 
." Error! unbalanced control structure " 
0 M68?PAIRS ! ABORT THEN ; 
: ?M68K ( Check for errors in compiling a subroutine ) 
M68K? @ 0= IF ( Check if compiling a subroutine ) 
." Error! not compiling a SUBROUTINE " 
ABORT THEN ; 
: ?M68MAC ( Check for errors in compiling a macro ) 
M68?MAC @ 0= IF ( Check if compiling a macro ) 
." Error! not compiling a MACRO " 
ABORT THEN ; 
--> 


( M68K Cross Compiler -- Compile constants ) 
M68K DEFINITIONS 
( n -- n' )
: HIGH-BYTE FLIP ; ( Leave high byte of n on stack )
( n -- )
: $CON DUP HIGH-BYTE C, C, ; ( Compile const high-byte first )
( d -- )
: $DCON $CON ( Compile high word )
$CON ; ( Compile low word )
--> 
Note.. to transport the compiler to other FORTH systems the 
word HIGH-BYTE must be written so that it takes the number off 
the top of the stack and leaves the high byte of that number. 
On some FORTH systems HIGH-BYTE may have to be a CODE 
definition. 
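
The byte order that $CON and $DCON have to produce can be checked outside FORTH. The Python model below is only an illustrative sketch, assuming 16-bit cells and 32-bit doubles. The names `con` and `dcon` are hypothetical stand-ins for $CON and $DCON, and the returned bytes stand for what C, would append to the host dictionary.

```python
def con(n: int) -> bytes:
    """Model of $CON: one 16-bit cell, high byte first (68000 byte order)."""
    n &= 0xFFFF                      # keep a single 16-bit cell
    return bytes([n >> 8, n & 0xFF])


def dcon(d: int) -> bytes:
    """Model of $DCON: a 32-bit double, high word first, then low word."""
    d &= 0xFFFFFFFF
    return con(d >> 16) + con(d & 0xFFFF)


if __name__ == "__main__":
    print(con(0x1234).hex())         # high byte 12 is emitted before 34
    print(dcon(0x7800).hex())        # high word 0000, then low word 7800
```

On a host whose FORTH stores cells low byte first, this is the reason a plain `,` cannot be used to compile constants for the 68000.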


( M68K Cross Compiler -- Compiling Words ) 
M68K DEFINITIONS HEX 
( address -- ) 
: M68MAC ( Compile MACRO code into any definition ) 
DUP @ SWAP 2+ OVER HERE SWAP CMOVE ALLOT ; 
: M68SUB ( Compile SUBROUTINE code into subroutine definition ) 
?M68K 61 C, 00 C, ( BSR addr ) 
HERE M68PFA @ 2+ - ( Compute code length )
M68ENTRY @ + SWAP @ SWAP - ( Compute displacement )
$CON ; ( Compile displacement )
--> 
Note.. the memory image of a MACRO to be compiled is: 
addr Number of bytes of code to compile 
addr+2 Bytes of code to be compiled. 
The memory image of a SUBROUTINE to be compiled is: 
addr Address of subroutine relative to start of code 
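
The displacement arithmetic in M68SUB can be followed with a small model. This is a sketch under stated assumptions, not the compiler itself: `bsr_bytes` and its parameter names are hypothetical, `target` plays the role of the address stored in the called subroutine's parameter field, `entry` the value of M68ENTRY, and `bytes_so_far` the number of code bytes already compiled in the current definition before the BSR opcode.

```python
def bsr_bytes(target: int, entry: int, bytes_so_far: int) -> bytes:
    """Return the 4 bytes of a BSR with a 16-bit displacement.

    The 68000 adds the displacement to the address of the extension
    word, i.e. the address just past the two opcode bytes 61 00.
    """
    pc = entry + bytes_so_far + 2        # address of the displacement word
    disp = (target - pc) & 0xFFFF        # 16-bit two's complement
    return bytes([0x61, 0x00, disp >> 8, disp & 0xFF])


if __name__ == "__main__":
    # Caller starts at 0x100 and has 4 bytes compiled; callee starts at 0.
    print(bsr_bytes(target=0x000, entry=0x100, bytes_so_far=4).hex())
```

Because both addresses are relative to the start of the code, the resulting displacement stays valid wherever the code is finally loaded.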
( M68K Cross Compiler -- MACRO Compiling Words ) 
FORTH DEFINITIONS 
( Create header and set compiler variables )
: :M68MAC ( Begin a MACRO definition )
[COMPILE] M68K DEFINITIONS
M68K 1 M68?MAC ! CREATE HERE M68PFA !
0 , ( Initialize the number of bytes field )
DOES> M68MAC ;
M68K DEFINITIONS 
: ;M68MAC ( terminate a MACRO type definition ) 
?M68PAIRS ?M68MAC 0 M68?MAC ! ( Error check & reset ) 


HERE M68PFA @ 2+ - ( Compute code length ) 
M68PFA @ ! ( Store in length field ) 
[COMPILE] FORTH DEFINITIONS ; 

--> 

( M68K Cross Compiler -- Compiling words - constants ) 


FORTH DEFINITIONS HEX 

: M68CON ( Define a single precision constant ) 
:M68MAC 3D C, 3C C, ( MOVE.W #const,-[A6] ) 
M68K $CON ( Compile constant )
;M68MAC ;

FORTH DEFINITIONS 

: M68DCON ( Define a double precision constant ) 
:M68MAC 2D C, 3C C, ( MOVE.L #const,-[A6] ) 
M68K $DCON ( Compile double constant )
;M68MAC ;
--> 
( M68K Cross Compiler -- Compiling words - variables )
FORTH DEFINITIONS

( n -- )

: M68ALLOT ( Allot n-bytes in variable pool )
DUP 1 AND IF 1+ THEN ( Round up to an even byte count )
M68K M68PVAR +! ; ( Update pointer ) 

FORTH DEFINITIONS 

: M68VAR ( Define a single precision variable ) 
M68K M68PVAR @ 2 M68ALLOT ( Get and update pointer ) 
M68CON ( Define the address as a constant ) ; 

FORTH DEFINITIONS 

: M68DVAR ( Define a double precision variable ) 
M68K M68PVAR @ 4 M68ALLOT ( Get and update pointer ) 
M68CON ( Define the address as a constant ) ; 

--> 
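
The way M68ALLOT, M68VAR and M68DVAR hand out space in the variable pool can be sketched as follows. This is a hypothetical Python model, not part of the compiler: the class `VarPool` and its method names are invented for illustration, and `pvar` stands for the value held in M68PVAR (an offset relative to A5).

```python
class VarPool:
    """Model of the variable pool pointer kept in M68PVAR."""

    def __init__(self) -> None:
        self.pvar = 0                # offset of the next free byte

    def allot(self, n: int) -> None:
        """M68ALLOT: reserve n bytes, rounded up to an even count."""
        if n & 1:
            n += 1
        self.pvar += n

    def var(self) -> int:
        """M68VAR: return the current offset, then reserve 2 bytes."""
        addr = self.pvar
        self.allot(2)
        return addr

    def dvar(self) -> int:
        """M68DVAR: return the current offset, then reserve 4 bytes."""
        addr = self.pvar
        self.allot(4)
        return addr


if __name__ == "__main__":
    pool = VarPool()
    a = pool.var()
    b = pool.dvar()
    pool.allot(3)                    # odd request is rounded up to 4
    c = pool.var()
    print(a, b, c)
```

Keeping every allocation even means word and long-word accesses through A5 never land on an odd address, which the 68000 does not allow.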


( M68K Cross Compiler -- SUBROUTINE Compiling Words ) 
FORTH DEFINITIONS 
( Create header and set compiler variables ) 
: :M68K ( Begin a SUBROUTINE definition ) 

[COMPILE] M68K DEFINITIONS 

M68K 1 M68K? ! ( Set to compiling ) 

CREATE HERE M68PFA ! M68PCODE @ DUP 

M68ENTRY ! , ( Store subroutine address ) 

DOES> M68SUB ; 
--> 
Note.. a SUBROUTINE definition may call itself if there are 
no side effects. This means that all data altered by the 
defined word should be on the stack, not stored in variables.

( M68K Cross Compiler -- Code output )
M68K DEFINITIONS
( byte to be sent to code output file -- )
: M68OUT ( Link to the code output file )

BASE @ >R HEX . CR R> BASE ! ; 
--> 
Note.. the code in the above definition should be replaced 
with the appropriate words to send the compiler output to the 
code file of your choice. This could be a disk file, a tape, 
your MC68000 computer, a printer, or any other output sink you 
may want to use. The protocol is determined by your output
word. The compiler does not assume any protocol so it is a
general purpose tool for generating MC68000 code. 
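
The idea of a replaceable output word can be pictured with a short model. This is a sketch only: `flush_code`, `hex_sink` and `image` are hypothetical names, and the model assumes nothing more than what the note says, namely that the compiler hands the output word one byte at a time.

```python
from typing import Callable, Iterable


def flush_code(code: Iterable[int], m68out: Callable[[int], None]) -> None:
    """Send each compiled byte to whatever output word is plugged in."""
    for byte in code:
        m68out(byte & 0xFF)


def hex_sink(byte: int) -> None:
    """One possible sink: print each byte in hex on its own line."""
    print(f"{byte:X}")


if __name__ == "__main__":
    rts = [0x4E, 0x75]               # the RTS that ends every subroutine
    flush_code(rts, hex_sink)        # listing on the terminal

    image = bytearray()              # another sink: collect a memory image
    flush_code(rts, image.append)
    print(image.hex())
```

Swapping the sink changes where the code goes without touching the code generator, which is the property the note describes.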


( M68K Cross Compiler -- SUBROUTINE Compiling Words ) 
M68K DEFINITIONS HEX 
: ;M68K ( Terminate a SUBROUTINE definition ) 
?M68PAIRS ?M68K 0 M68K? ! ( Error check & reset ) 
4E C, 75 C, ( Compile an RTS instruction )
HERE M68PFA @ 2+ - ( Compute code length )
DUP M68PCODE +! ( Update code pointer )
M68PFA @ 2+ ( Start of compiled code )
SWAP 0 DO
DUP C@ M68OUT 1+ ( Output a byte of code )
LOOP DROP 
M68PFA @ 2+ DP ! ( Delete code from dictionary ) 


[COMPILE] FORTH DEFINITIONS ; 


( M68K Cross Compiler -- EXTERNAL ) 
M68K DEFINITIONS 
: $EXTERNAL ( Define entry point as a constant in FORTH voc. )
[COMPILE] FORTH DEFINITIONS
M68ENTRY @ CONSTANT ;
FORTH DEFINITIONS
: EXTERNAL ( Compile an external reference )
M68K M68K? @ M68?MAC @ OR
IF ." Can't use EXTERNAL while compiling"
CR ABORT THEN
$EXTERNAL ;
-->
Note.. to send the external reference list somewhere else,
replace $EXTERNAL with the appropriate word. Make sure its
function is equivalent to the above, i.e. it must take the 
next word in the input stream as the identifier. 
( M68K Cross Compiler -- Words - literals ) 
M68K DEFINITIONS HEX 
: LITERAL ( Define a single precision literal )
3D C, 3C C, ( MOVE.W #const,-[A6] )
$CON ; ( Compile constant )
: DLITERAL ( Define a double precision literal )
2D C, 3C C, ( MOVE.L #const,-[A6] )
$DCON ; ( Compile double constant )
: BYTES BASE @ >R HEX
0 DO BL WORD NUMBER DROP C, LOOP
R> BASE ! ;
-->

Note.. Used as n BYTES followed by bytes to be compiled into 
the HOST dictionary. This word may be used within a :M68K 
or :M68MAC definition but NOT within a colon definition. 
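
What BYTES does with the input stream can be modeled in a few lines. The function below is a hypothetical sketch, assuming that the count is given in hex (as it is in the macro screens, which are loaded in HEX) and that the following tokens are always read in hex, as the definition of BYTES arranges.

```python
def bytes_word(text: str) -> bytes:
    """Model of `n BYTES b1 b2 ...`: return the n bytes that follow."""
    tokens = text.split()
    if len(tokens) < 2 or tokens[1] != "BYTES":
        raise ValueError("expected: n BYTES b1 b2 ...")
    count = int(tokens[0], 16)               # count, read in hex here
    data = tokens[2:2 + count]
    if len(data) != count:
        raise ValueError("fewer bytes than the count promises")
    return bytes(int(tok, 16) & 0xFF for tok in data)


if __name__ == "__main__":
    # The body of the + macro: MOVE.W (A6)+,D0 then ADD.W D0,(A6)
    print(bytes_word("4 BYTES 30 1E D1 56").hex())
```

A count that does not match the bytes actually written is the most likely mistake when entering macros by hand, so the model raises an error in that case.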


( M68K Cross Compiler -- Compiling words - arrays ) 
M68K DEFINITIONS HEX 
( adr -- ) 
: $M68ARY ( Define code for a single precision array )
:M68MAC 30 C, 3C C, ( MOVE.W #const,D0 )
$CON ( Compile address )
D0 C, 56 C, D1 C, 56 C,
;M68MAC ;
( adr -- )
: $M68DARY ( Define code for a double precision array )
:M68MAC 30 C, 3C C, ( MOVE.W #const,D0 )
$CON ( Compile address )
32 C, 16 C, E5 C, 41 C, D0 C, 41 C, 3C C, 80 C,
;M68MAC ;
--> 
( M68K Cross Compiler -- Compiling words - arrays ) 
FORTH DEFINITIONS 
( n -- )

: M68ARY ( Define a single precision array n cells long )
M68K M68PVAR @ ( Get base address )
$M68ARY ( Define the referencing code )
2* M68ALLOT ( Update variable pointer ) ;

FORTH DEFINITIONS

( n -- )

: M68DARY ( Define a double precision array n cells long )
M68K M68PVAR @ ( Get base address )
$M68DARY ( Define the referencing code )
4 * M68ALLOT ( Update variable pointer ) ;


( M68K Cross Compiler -- Compiling words - arrays ) 

M68K DEFINITIONS HEX 

( adr -- ) 

: $M68CARY ( Define code for a byte array )
:M68MAC 30 C, 3C C, ( MOVE.W #const,D0 )
$CON ( Compile address )
D1 C, 56 C, ;M68MAC ;

FORTH DEFINITIONS

( n -- )

: M68CARY ( Define a byte array n cells long )
M68K M68PVAR @ ( Get base address )
$M68CARY ( Define the referencing code )
M68ALLOT ; ( Update variable pointer ) 
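
The code compiled by the three array words turns an index on the data stack into an address relative to the variable pool. The arithmetic can be written out as a hypothetical Python helper; `element_address` is an invented name, and the 16-bit wraparound is an assumption that follows from the word-sized ADD instructions used.

```python
def element_address(base: int, n: int, cell_size: int) -> int:
    """Pool-relative address of element n (element 0 sits at base)."""
    return (base + n * cell_size) & 0xFFFF


if __name__ == "__main__":
    base = 0x0010
    print(hex(element_address(base, 3, 1)))   # M68CARY: bytes
    print(hex(element_address(base, 3, 2)))   # M68ARY: 16-bit cells
    print(hex(element_address(base, 3, 4)))   # M68DARY: 32-bit cells
```

No bounds check is compiled, so an index beyond the allotted length silently addresses whatever follows the array in the pool.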


( M68K Cross Compiler -- Control error checking ) 

M68K DEFINITIONS HEX 

( Error checking codes )

CONSTANT $ECD-IF

CONSTANT $ECD-BEGIN

CONSTANT $ECD-DO

CONSTANT $ECD-WHILE

: $ERR-?PAIRS ( Abort if no control structure is started )
M68?PAIRS @ 0=
IF ." No control structure! " ABORT CR THEN ;
: $ERR-ABT ( Complete error message and abort )
." expected " CR ABORT ;
: $ERR-IF ( Abort if no IF structure )
$ERR-?PAIRS $ECD-IF -
IF ." IF structure " $ERR-ABT THEN ;
-->
( M68K Cross Compiler -- Control error checking )
: $ERR-BEGIN ( Abort if no BEGIN structure )
$ERR-?PAIRS $ECD-BEGIN -
IF ." BEGIN structure " $ERR-ABT THEN ;
: $ERR-DO ( Abort if no DO structure )
$ERR-?PAIRS $ECD-DO -
IF ." DO structure " $ERR-ABT THEN ;
: $ERR-WHILE ( Abort if no WHILE structure )
$ERR-?PAIRS $ECD-WHILE -
IF ." WHILE structure " $ERR-ABT THEN ;


( M68K Cross Compiler -- Control structures ) 

( adr -- ) 

: $FOR-RES ( Resolve a forward branch )
HERE OVER - ( Compute relative address )
SWAP OVER HIGH-BYTE OVER C! ( Store high byte )
1+ C! ; ( Store low byte )

( adr -- )
: $BAK-RES ( Resolve a back branch )
HERE - ( Compute relative address )
$CON ; ( Compile address )
--> 
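
The two branch-resolution words differ only in when the displacement becomes known. The Python model below is a sketch under stated assumptions: the class `CodeBuffer` and its method names are hypothetical, the buffer stands in for the host dictionary, and displacements are measured from the address of the 16-bit extension word, which is where the 68000 program counter points when a branch with a word displacement is executed.

```python
class CodeBuffer:
    """Stand-in for the host dictionary that receives compiled bytes."""

    def __init__(self) -> None:
        self.code = bytearray()

    def here(self) -> int:
        return len(self.code)

    def emit(self, *values: int) -> None:
        self.code.extend(v & 0xFF for v in values)

    def emit_word(self, n: int) -> None:
        n &= 0xFFFF
        self.emit(n >> 8, n & 0xFF)          # high byte first

    def forward_branch(self, opcode: int) -> int:
        """Compile a branch whose target is not yet known."""
        self.emit(opcode, 0x00)
        slot = self.here()                   # address to patch later
        self.emit_word(0)                    # placeholder displacement
        return slot

    def resolve_forward(self, slot: int) -> None:
        """Model of the forward resolver: patch the placeholder."""
        disp = (self.here() - slot) & 0xFFFF
        self.code[slot] = disp >> 8
        self.code[slot + 1] = disp & 0xFF

    def backward_branch(self, opcode: int, target: int) -> None:
        """Model of the back resolver: target is already known."""
        self.emit(opcode, 0x00)
        self.emit_word(target - self.here())


if __name__ == "__main__":
    buf = CodeBuffer()
    buf.emit(0x4A, 0x5E)                     # TST.W (A6)+
    slot = buf.forward_branch(0x67)          # BEQ, target unknown
    buf.emit(0x52, 0x56)                     # body: ADDQ.W #1,(A6)
    buf.resolve_forward(slot)                # THEN
    print(buf.code.hex())
```

Since only relative displacements are stored, the compiled control structures remain position independent.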
( M68K Cross Compiler -- Control structures ) 
( -- adr ecd ) 
: IF ( Compile IF structure, leave address to be resolved )
( and an error checking code )
4A C, 5E C, 67 C, 00 C,
HERE $ECD-IF 1 M68?PAIRS +!
0 , ; ( Leave space for branch address )
: ELSE ( Compile an ELSE structure )
$ERR-IF 60 C, 00 C,
HERE SWAP ( Save current location and get IF adr )
0 , ( Leave space for branch address )
$FOR-RES ( Resolve IF branch )
$ECD-IF ;
: THEN ( Resolve an IF structure )
$ERR-IF $FOR-RES -1 M68?PAIRS +! ;
--> 
( M68K Cross Compiler -- Control structures ) 
( -- adr ecd ) 
: BEGIN ( Compile a BEGIN structure ) 
HERE $ECD-BEGIN 1 M68?PAIRS +! ;
: UNTIL ( Resolve BEGIN .. UNTIL loop )
$ERR-BEGIN 4A C, 5E C, 67 C, 00 C,
$BAK-RES ( Resolve BEGIN branch )
-1 M68?PAIRS +! ;

: AGAIN ( Resolve BEGIN .. AGAIN loop )
$ERR-BEGIN 60 C, 00 C,
$BAK-RES ( Resolve BEGIN branch )
-1 M68?PAIRS +! ; 


( M68K Cross Compiler -- Control structures ) 
: WHILE ( Compile WHILE section of loop ) 
DUP $ERR-BEGIN 4A C, 5E C, 67 C, 00 C,
HERE $ECD-WHILE 0 , ; ( Leave space for address )
: REPEAT ( Resolve BEGIN .. WHILE .. REPEAT loop )
$ERR-WHILE SWAP $ERR-BEGIN
60 C, 00 C, ( Code for back branch )
SWAP $BAK-RES ( Resolve BEGIN branch )
$FOR-RES ( Resolve WHILE branch )
-1 M68?PAIRS +! ; 


--> 
( M68K Cross Compiler -- Control structures ) 
: DO ( Compile a DO structure )
2F C, 1E C,
HERE $ECD-DO 1 M68?PAIRS +! ;
: LOOP ( Terminate a DO .. LOOP )
$ERR-DO 52 C, 57 C, 4C C, 97 C, 00 C, 03 C,
B0 C, 41 C, 6D C, 00 C,
$BAK-RES ( Resolve DO branch )
58 C, 8F C, ( Drop index and limit )
-1 M68?PAIRS +! ;


( M68K Cross Compiler -- Control structures )
: +LOOP ( Terminate a DO .. +LOOP )
$ERR-DO 30 C, 1E C, D1 C, 57 C, 4C C, 97 C,
00 C, 06 C, 4A C, 40 C, 6E C, 04 C, B4 C, 41 C,
60 C, 02 C, B2 C, 42 C, 6D C, 00 C,
$BAK-RES ( Resolve DO branch )
58 C, 8F C, ( Drop index and limit )
-1 M68?PAIRS +! ;


:M68MAC LEAVE 4 BYTES 3F 57 00 02 ;M68MAC 
--> 


( M68K Cross Compiler -- Control structures ) 
:M68MAC JSR.W 4 BYTES 30 5E 4E 90 ;M68MAC
:M68MAC JSR.L 4 BYTES 20 5E 4E 90 ;M68MAC
:M68MAC JMP.W 4 BYTES 30 5E 4E D0 ;M68MAC
:M68MAC JMP.L 4 BYTES 20 5E 4E D0 ;M68MAC


( M68K Cross Compiler -- Initialization words ) 
M68K DEFINITIONS HEX 


( d -- )
: A5LD ( Load variable pool pointer )
2A C, 7C C, $DCON ;
: A6LD ( Load data stack pointer )
2C C, 7C C, $DCON ;
: A7LD ( Load return stack pointer )
2E C, 7C C, $DCON ;
-->
Note.. to create true modular programs there should be an
operating system that loads the appropriate registers and then
calls the module. In that case these words should be discarded
since the address is determined at compile time instead of run
time.


( M68K Cross Compiler -- Arithmetic words ) 


HEX
:M68MAC + 4 BYTES 30 1E D1 56 ;M68MAC
:M68MAC - 4 BYTES 30 1E 91 56 ;M68MAC
:M68MAC * 6 BYTES 30 1E C1 D6 3C 80 ;M68MAC
:M68MAC / 8 BYTES 4C 9E 00 03 83 C0 3D 01 ;M68MAC
:M68MAC D+ 4 BYTES 20 1E D1 96 ;M68MAC
:M68MAC D- 4 BYTES 20 1E 91 96 ;M68MAC
:M68MAC */ 0A BYTES 32 1E 30 1E C1 D6 81 C1 3C 80 ;M68MAC
:M68MAC /MOD 8 BYTES 42 80 32 1E 30 1E 80 C1
  4 BYTES 48 40 2D 00 ;M68MAC
:M68MAC MOD 8 BYTES 42 80 32 1E 30 1E 80 C1
  4 BYTES 48 40 3D 00 ;M68MAC
:M68MAC */MOD 8 BYTES 32 1E 30 1E C0 DE 80 C1
  4 BYTES 48 40 2D 00 ;M68MAC
-->


( M68K Cross Compiler -- Arithmetic words ) 


:M68MAC U* 6 BYTES 30 1E C0 DE 2D 00 ;M68MAC
:M68MAC U/MOD 0A BYTES 32 1E 20 1E 80 C1 48 40 2D 00 ;M68MAC
:M68MAC 1+ 2 BYTES 52 56 ;M68MAC
:M68MAC 1- 2 BYTES 53 56 ;M68MAC
:M68MAC 2+ 2 BYTES 54 56 ;M68MAC
:M68MAC 2- 2 BYTES 55 56 ;M68MAC
:M68MAC 2* 2 BYTES E1 D6 ;M68MAC
:M68MAC 2/ 2 BYTES E0 D6 ;M68MAC
:M68MAC NEGATE 2 BYTES 44 56 ;M68MAC
:M68MAC MINUS NEGATE ;M68MAC
:M68MAC DNEGATE 2 BYTES 44 96 ;M68MAC
:M68MAC DMINUS DNEGATE ;M68MAC
:M68MAC ABS 6 BYTES 4A 56 6C 02 44 56 ;M68MAC
:M68MAC DABS 6 BYTES 4A 96 6C 02 44 96 ;M68MAC
-->


( M68K Cross Compiler -- Stack manipulation ) 


:M68MAC DROP 2 BYTES 54 8E ;M68MAC
:M68MAC 2DROP 2 BYTES 58 8E ;M68MAC
:M68MAC DUP 2 BYTES 3D 16 ;M68MAC
:M68MAC 2DUP 2 BYTES 2D 16 ;M68MAC
:M68MAC SWAP 6 BYTES 20 16 48 40 2C 80 ;M68MAC
:M68MAC 2SWAP 0A BYTES 20 16 2C AE 00 04 2D 40 00 04 ;M68MAC
:M68MAC OVER 4 BYTES 3D 2E 00 02 ;M68MAC
:M68MAC 2OVER 4 BYTES 2D 2E 00 04 ;M68MAC
:M68MAC >R 2 BYTES 3F 1E ;M68MAC
:M68MAC R> 2 BYTES 3D 1F ;M68MAC
:M68MAC I 2 BYTES 3D 17 ;M68MAC
:M68MAC I' 4 BYTES 3D 2F 00 02 ;M68MAC
:M68MAC J 4 BYTES 3D 2F 00 04 ;M68MAC
-->


( M68K Cross Compiler -- Comparison operations ) 
:M68MAC = 6 BYTES 30 1E 32 1E B2 40
  8 BYTES 57 C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC < 6 BYTES 30 1E 32 1E B2 40
  8 BYTES 5D C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC > 6 BYTES 30 1E 32 1E B2 40
  8 BYTES 5E C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC MIN 6 BYTES 30 1E 32 16 B0 41
  6 BYTES 6F 02 C1 41 3C 80 ;M68MAC
:M68MAC MAX 6 BYTES 30 1E 32 16 B0 41
  6 BYTES 6C 02 C1 41 3C 80 ;M68MAC
--> 


( M68K Cross Compiler -- Comparison operations ) 
:M68MAC D= 6 BYTES 20 1E 22 1E B2 80
  8 BYTES 57 C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC D< 6 BYTES 20 1E 22 1E B2 80
  8 BYTES 5D C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC D> 6 BYTES 20 1E 22 1E B2 80
  8 BYTES 5E C0 02 40 00 01 3D 00 ;M68MAC


--> 


( M68K Cross Compiler -- Comparison operations ) 
:M68MAC 0= 2 BYTES 4A 5E
  8 BYTES 57 C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC NOT 0= ;M68MAC
:M68MAC 0< 2 BYTES 4A 5E
  8 BYTES 5D C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC 0> 2 BYTES 4A 5E
  8 BYTES 5E C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC D0= 2 BYTES 4A 9E
  8 BYTES 57 C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC D0< 2 BYTES 4A 9E
  8 BYTES 5D C0 02 40 00 01 3D 00 ;M68MAC
:M68MAC D0> 2 BYTES 4A 9E
  8 BYTES 5E C0 02 40 00 01 3D 00 ;M68MAC
--> 
( M68K Cross Compiler -- Comparison operations ) 


:M68MAC AND 4 BYTES 30 1E C1 56 ;M68MAC
:M68MAC OR 4 BYTES 30 1E 81 56 ;M68MAC
:M68MAC XOR 4 BYTES 30 1E B1 56 ;M68MAC
:M68MAC 1'S 2 BYTES 46 56 ;M68MAC

--> 


( M68K Cross Compiler -- Memory and I/O operations ) 
:M68MAC ! 6 BYTES 30 1E 3B 9E 00 00 ;M68MAC
:M68MAC @ 6 BYTES 30 16 3C B5 00 00 ;M68MAC
:M68MAC 2! 6 BYTES 30 1E 2B 9E 00 00 ;M68MAC
:M68MAC 2@ 6 BYTES 30 1E 2D 35 00 00 ;M68MAC
:M68MAC +! 8 BYTES 30 1E 32 1E D3 75 00 00 ;M68MAC
:M68MAC C! 8 BYTES 30 1E 32 1E 1B 81 00 00 ;M68MAC
:M68MAC C@ 0A BYTES 30 16 42 41 12 35 00 00 3C 81 ;M68MAC
:M68MAC FILL 8 BYTES 30 1E 32 1E 30 5E D1 CD
  8 BYTES 60 02 10 C0 51 C9 FF FC ;M68MAC
-->

( M68K Cross Compiler -- Memory and I/O operations )
:M68MAC AW! 4 BYTES 30 5E 30 9E ;M68MAC
:M68MAC AW@ 4 BYTES 30 56 3C 90 ;M68MAC
:M68MAC AL! 4 BYTES 20 5E 30 9E ;M68MAC
:M68MAC AL@ 4 BYTES 20 5E 3D 10 ;M68MAC
:M68MAC CAW! 6 BYTES 30 5E 30 1E 10 80 ;M68MAC
:M68MAC CAW@ 8 BYTES 30 56 42 40 10 10 3C 80 ;M68MAC
:M68MAC CAL! 6 BYTES 20 5E 30 1E 10 80 ;M68MAC
:M68MAC CAL@ 8 BYTES 20 5E 42 40 10 10 3D 00 ;M68MAC
:M68MAC 2AW! 4 BYTES 30 5E 20 9E ;M68MAC
:M68MAC 2AW@ 4 BYTES 30 5E 2D 10 ;M68MAC
:M68MAC 2AL! 4 BYTES 20 5E 20 9E ;M68MAC
:M68MAC 2AL@ 4 BYTES 20 56 2C 90 ;M68MAC
:M68MAC AFILL 8 BYTES 30 1E 32 1E 20 5E 60 02
  6 BYTES 10 C0 51 C9 FF FC ;M68MAC


Listing 8.2


; This file contains the assembler language code for all of the
; operations in the M68K compiler.
;
; Register usage....
;
;       A7 - Hardware and return stack pointer
;       A6 - Data stack pointer
;       A5 - Pointer to variable pool
;       A4 - Reserved for future use
;
; All other registers are free to be used by any word that needs them
; and are to be considered as altered across word boundaries.


;*********************************************************************
; Arithmetic operations
;*********************************************************************
;       +       ( n1 n2 -- sum )
301E            MOVE.W  (A6)+,D0        ;Get n2
D156            ADD.W   D0,(A6)         ;n1 + n2
;       -       ( n1 n2 -- dif )        n1-n2
301E            MOVE.W  (A6)+,D0        ;Get n2
9156            SUB.W   D0,(A6)         ;n1 - n2
;       *       ( n1 n2 -- prod )
301E            MOVE.W  (A6)+,D0        ;Get n2
C1D6            MULS    (A6),D0         ;n2 * n1
3C80            MOVE.W  D0,(A6)
;       /       ( n1 n2 -- quot )       n1/n2
4C9E 0003       MOVEM.W (A6)+,D0/D1     ;Get operands sign extended
83C0            DIVS    D0,D1           ;n1/n2
3D01            MOVE.W  D1,-(A6)
;       */      ( n1 n2 n3 -- n-result ) n1*n2/n3
321E            MOVE.W  (A6)+,D1        ;Get n3
301E            MOVE.W  (A6)+,D0        ;Get n2
C1D6            MULS    (A6),D0         ;n2*n1 -> D0
81C1            DIVS    D1,D0           ;n2*n1/n3 -> D0
3C80            MOVE.W  D0,(A6)
;       /MOD    ( u1 u2 -- u-rem u-quot )
4280            CLR.L   D0
321E            MOVE.W  (A6)+,D1        ;Get u2
301E            MOVE.W  (A6)+,D0        ;Get u1
80C1            DIVU    D1,D0           ;u1/u2
4840            SWAP    D0              ;Interchange remainder and quotient
2D00            MOVE.L  D0,-(A6)        ;Return both on stack
;       MOD     ( u1 u2 -- u-rem )
4280            CLR.L   D0
321E            MOVE.W  (A6)+,D1        ;Get u2
301E            MOVE.W  (A6)+,D0        ;Get u1
80C1            DIVU    D1,D0           ;u1/u2
4840            SWAP    D0              ;Interchange remainder and quotient
3D00            MOVE.W  D0,-(A6)        ;Return remainder on stack
;       */MOD   ( u1 u2 u3 -- u-rem u-result ) u1*u2/u3
321E            MOVE.W  (A6)+,D1        ;Get u3
301E            MOVE.W  (A6)+,D0        ;Get u2
C0DE            MULU    (A6)+,D0        ;u2*u1 -> D0
80C1            DIVU    D1,D0           ;u2*u1/u3 -> D0
4840            SWAP    D0
2D00            MOVE.L  D0,-(A6)
;       U*      ( u1 u2 -- ud )
301E            MOVE.W  (A6)+,D0
C0DE            MULU    (A6)+,D0
2D00            MOVE.L  D0,-(A6)
;       U/MOD   ( ud u1 -- u-rem u-quot )
321E            MOVE.W  (A6)+,D1        ;Get u1
201E            MOVE.L  (A6)+,D0        ;Get ud
80C1            DIVU    D1,D0           ;ud/u1
4840            SWAP    D0
2D00            MOVE.L  D0,-(A6)
;       1+      ( n -- n+1 )
5256            ADDQ.W  #1,(A6)
;       1-      ( n -- n-1 )
5356            SUBQ.W  #1,(A6)
;       2+      ( n -- n+2 )
5456            ADDQ.W  #2,(A6)
;       2-      ( n -- n-2 )
5556            SUBQ.W  #2,(A6)
;       2*      ( n -- n*2 )
E1D6            ASL.W   (A6)
;       2/      ( n -- n/2 )
E0D6            ASR.W   (A6)
;       ABS     ( n -- abs(n) )
4A56            TST.W   (A6)            ;Test for negative
6C02            BGE.S   ABS             ;Skip next instruction if not
4456            NEG.W   (A6)
        ABS
;       DABS    ( d -- abs(d) )
4A96            TST.L   (A6)            ;Test for negative
6C02            BGE.S   DABS            ;Skip next instruction if not
4496            NEG.L   (A6)
        DABS
;       NEGATE
4456            NEG.W   (A6)
;       DNEGATE
4496            NEG.L   (A6)
;       D+      ( d1 d2 -- d-sum )
201E            MOVE.L  (A6)+,D0
D196            ADD.L   D0,(A6)
;       D-      ( d1 d2 -- d-diff )     d1-d2
201E            MOVE.L  (A6)+,D0
9196            SUB.L   D0,(A6)
;
;*********************************************************************
; Stack manipulation
;*********************************************************************
;       DROP    ( n -- )
548E            ADDQ.L  #2,A6
;       2DROP   ( d -- )
588E            ADDQ.L  #4,A6
;       SWAP    ( n1 n2 -- n2 n1 )
2016            MOVE.L  (A6),D0
4840            SWAP    D0
2C80            MOVE.L  D0,(A6)
;       2SWAP   ( d1 d2 -- d2 d1 )
2016            MOVE.L  (A6),D0
2CAE 0004       MOVE.L  4(A6),(A6)
2D40 0004       MOVE.L  D0,4(A6)
;       DUP     ( n -- n n )
3D16            MOVE.W  (A6),-(A6)
;       2DUP    ( d -- d d )
2D16            MOVE.L  (A6),-(A6)
;       OVER    ( n1 n2 -- n1 n2 n1 )
3D2E 0002       MOVE.W  2(A6),-(A6)
;       2OVER   ( d1 d2 -- d1 d2 d1 )
2D2E 0004       MOVE.L  4(A6),-(A6)
;       >R      ( n -- )        Store on return stack
3F1E            MOVE.W  (A6)+,-(A7)
;       R>      ( -- n )        Remove from return stack
3D1F            MOVE.W  (A7)+,-(A6)
;       I       ( -- n )        Copies top of return stack
3D17            MOVE.W  (A7),-(A6)
;       I'      ( -- n )        Copies second item on return stack
3D2F 0002       MOVE.W  2(A7),-(A6)
;       J       ( -- n )        Copies third item on return stack
3D2F 0004       MOVE.W  4(A7),-(A6)
;
;       ( -- n )        Push a constant onto the stack
3D3C 0000       MOVE.W  #0,-(A6)
;       ( -- d )        Push a double constant onto the stack
2D3C 0000 0000  MOVE.L  #0,-(A6)
;*********************************************************************
; Memory and I/O operations
;*********************************************************************
; Note.. all references to memory are relative to A5 unless otherwise
;        specified.
;
;       !       ( n adr -- )    Store in variable
301E            MOVE.W  (A6)+,D0
3B9E 0000       MOVE.W  (A6)+,0(A5,D0.W)
;       @       ( adr -- n )    Get from variable
3016            MOVE.W  (A6),D0
3CB5 0000       MOVE.W  0(A5,D0.W),(A6)
;       C!      ( c adr -- )    Store in variable
301E            MOVE.W  (A6)+,D0
321E            MOVE.W  (A6)+,D1
1B81 0000       MOVE.B  D1,0(A5,D0.W)
;       C@      ( adr -- c )    Get from variable
3016            MOVE.W  (A6),D0
4241            CLR.W   D1
1235 0000       MOVE.B  0(A5,D0.W),D1
3C81            MOVE.W  D1,(A6)
;       2!      ( d adr -- )    Store in variable
301E            MOVE.W  (A6)+,D0
2B9E 0000       MOVE.L  (A6)+,0(A5,D0.W)
;       2@      ( adr -- d )    Get from variable
301E            MOVE.W  (A6)+,D0
2D35 0000       MOVE.L  0(A5,D0.W),-(A6)
;       +!      ( n adr -- )
;       Add n to the location pointed to by adr
301E            MOVE.W  (A6)+,D0
321E            MOVE.W  (A6)+,D1
D375 0000       ADD.W   D1,0(A5,D0.W)

;       M68ARY xxxx     ( n -- ) defines an array xxxx n words long
;
;       xxxx    ( n -- adr ) returns the address of the n-th element of xxxx
303C 0000       MOVE.W  #0,D0           ;Array base address
D056            ADD.W   (A6),D0         ;n + address
D156            ADD.W   D0,(A6)         ;2*n + address
;
;       M68CARY xxxx    ( n -- ) defines an array xxxx n bytes long
;
;       xxxx    ( n -- adr ) returns the address of the n-th element of xxxx
303C 0000       MOVE.W  #0,D0           ;Array base address
D156            ADD.W   D0,(A6)         ;n + address
;
;       M68DARY xxxx    ( n -- ) defines an array xxxx n double words long
;
;       xxxx    ( n -- adr ) returns the address of the n-th element of xxxx
303C 0000       MOVE.W  #0,D0           ;Array base address
3216            MOVE.W  (A6),D1         ;n
E541            ASL.W   #2,D1           ;4*n
D041            ADD.W   D1,D0           ;4*n + address
3C80            MOVE.W  D0,(A6)
;
;       FILL    ( adr n b -- )
;       Fills n bytes of memory beginning at the variable pool
;       relative address with the value b.
301E            MOVE.W  (A6)+,D0        ;Get b
321E            MOVE.W  (A6)+,D1        ;Get n
305E            MOVEA.W (A6)+,A0        ;Get variable pool relative address
D1CD            ADDA.L  A5,A0           ;Compute actual address
6002            BRA.S   $02             ;Enter loop at proper point
10C0      $01   MOVE.B  D0,(A0)+        ;Store b and increment address
51C9 FFFC $02   DBF     D1,$01          ;Repeat n times


;*********************************************************************
; Note.. the following words reference absolute memory addresses. They should
;        only be used to reference data and I/O devices that are fixed and
;        outside the environment of the compiler. Under NO conditions should
;        these operations be used to reference data structures created by the
;        compiler. The compiler data structures are relocatable and there is
;        no easy way to find the current location of these data structures.
;        The other operators provided above are much more convenient and
;        preserve the relocatability.
;
;       AW!     ( n short -- )
;       Store n at the location specified by the short address.
305E            MOVEA.W (A6)+,A0        ;Get address
309E            MOVE.W  (A6)+,(A0)      ;Store n
;       AW@     ( short -- n )
;       Get n from the location specified by the short address.
3056            MOVEA.W (A6),A0         ;Get address
3C90            MOVE.W  (A0),(A6)       ;Get n
;       AL!     ( n long -- )
;       Store n at the location specified by the long address.
205E            MOVEA.L (A6)+,A0        ;Get address
309E            MOVE.W  (A6)+,(A0)      ;Store n
;       AL@     ( long -- n )
;       Get n from the location specified by the long address.
205E            MOVEA.L (A6)+,A0        ;Get address
3D10            MOVE.W  (A0),-(A6)      ;Get n
;       CAW!    ( c short -- )
;       Store c at the location specified by the short address.
305E            MOVEA.W (A6)+,A0        ;Get address
301E            MOVE.W  (A6)+,D0        ;Get c
1080            MOVE.B  D0,(A0)         ;Store c
;       CAW@    ( short -- c )
;       Get c from the location specified by the short address.
3056            MOVEA.W (A6),A0         ;Get address
4240            CLR.W   D0
1010            MOVE.B  (A0),D0         ;Get c
3C80            MOVE.W  D0,(A6)
;       CAL!    ( c long -- )
;       Store c at the location specified by the long address.
205E            MOVEA.L (A6)+,A0        ;Get address
301E            MOVE.W  (A6)+,D0        ;Get c
1080            MOVE.B  D0,(A0)         ;Store c
;       CAL@    ( long -- c )
;       Get c from the location specified by the long address.
205E            MOVEA.L (A6)+,A0        ;Get address
4240            CLR.W   D0
1010            MOVE.B  (A0),D0         ;Get c
3D00            MOVE.W  D0,-(A6)
;       2AW!    ( d short -- )
;       Store d at the location specified by the short address.
305E            MOVEA.W (A6)+,A0        ;Get address
209E            MOVE.L  (A6)+,(A0)      ;Store d
;       2AW@    ( short -- d )
;       Get d from the location specified by the short address.
305E            MOVEA.W (A6)+,A0        ;Get address
2D10            MOVE.L  (A0),-(A6)      ;Get d
;       2AL!    ( d long -- )
;       Store d at the location specified by the long address.
205E            MOVEA.L (A6)+,A0        ;Get address
209E            MOVE.L  (A6)+,(A0)      ;Store d
;       2AL@    ( long -- d )
;       Get d from the location specified by the long address.
2056            MOVEA.L (A6),A0         ;Get address
2C90            MOVE.L  (A0),(A6)       ;Get d
;       AFILL   ( long_adr n b -- )
;       Fills n bytes of memory beginning at the long absolute
;       address with the value b.
301E            MOVE.W  (A6)+,D0        ;Get b
321E            MOVE.W  (A6)+,D1        ;Get n
205E            MOVEA.L (A6)+,A0        ;Get absolute address
6002            BRA.S   $02             ;Enter loop at proper point
10C0      $01   MOVE.B  D0,(A0)+        ;Store b and increment address
51C9 FFFC $02   DBF     D1,$01          ;Repeat n times


;*********************************************************************
; Comparison operations
;*********************************************************************
;       MIN     ( n1 n2 -- n-min )
301E            MOVE.W  (A6)+,D0        ;n2
3216            MOVE.W  (A6),D1         ;n1
B041            CMP.W   D1,D0           ;n2-n1
6F02            BLE.S   MIN
C141            EXG     D0,D1           ;Swap if D1 < D0
3C80      MIN   MOVE.W  D0,(A6)
;       MAX     ( n1 n2 -- n-max )
301E            MOVE.W  (A6)+,D0        ;n2
3216            MOVE.W  (A6),D1         ;n1
B041            CMP.W   D1,D0           ;n2-n1
6C02            BGE.S   MAX
C141            EXG     D0,D1           ;Swap if D1 > D0
3C80      MAX   MOVE.W  D0,(A6)
;       =       ( n1 n2 -- f )  if n1 = n2 then f is true
301E            MOVE.W  (A6)+,D0        ;n2
321E            MOVE.W  (A6)+,D1        ;n1
B240            CMP.W   D0,D1           ;n1-n2
57C0            SEQ     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       <       ( n1 n2 -- f )  if n1 < n2 then f is true
301E            MOVE.W  (A6)+,D0        ;n2
321E            MOVE.W  (A6)+,D1        ;n1
B240            CMP.W   D0,D1           ;n1-n2
5DC0            SLT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       >       ( n1 n2 -- f )  if n1 > n2 then f is true
301E            MOVE.W  (A6)+,D0        ;n2
321E            MOVE.W  (A6)+,D1        ;n1
B240            CMP.W   D0,D1           ;n1-n2
5EC0            SGT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       D=      ( d1 d2 -- f )  if d1 = d2 then f is true
201E            MOVE.L  (A6)+,D0        ;d2
221E            MOVE.L  (A6)+,D1        ;d1
B280            CMP.L   D0,D1           ;d1-d2
57C0            SEQ     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       D<      ( d1 d2 -- f )  if d1 < d2 then f is true
201E            MOVE.L  (A6)+,D0        ;d2
221E            MOVE.L  (A6)+,D1        ;d1
B280            CMP.L   D0,D1           ;d1-d2
5DC0            SLT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       D>      ( d1 d2 -- f )  if d1 > d2 then f is true
201E            MOVE.L  (A6)+,D0        ;d2
221E            MOVE.L  (A6)+,D1        ;d1
B280            CMP.L   D0,D1           ;d1-d2
5EC0            SGT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       0=      ( n -- f )      if n = 0 then f is true
;       Alternate name is NOT
4A5E            TST.W   (A6)+
57C0            SEQ     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       0<      ( n -- f )      if n < 0 then f is true
4A5E            TST.W   (A6)+
5DC0            SLT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       0>      ( n -- f )      if n > 0 then f is true
4A5E            TST.W   (A6)+
5EC0            SGT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       D0=     ( d -- f )      if d = 0 then f is true
4A9E            TST.L   (A6)+
57C0            SEQ     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       D0<     ( d -- f )      if d < 0 then f is true
4A9E            TST.L   (A6)+
5DC0            SLT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       D0>     ( d -- f )      if d > 0 then f is true
4A9E            TST.L   (A6)+
5EC0            SGT     D0
0240 0001       ANDI.W  #1,D0
3D00            MOVE.W  D0,-(A6)
;       AND     ( u1 u2 -- and )
301E            MOVE.W  (A6)+,D0
C156            AND.W   D0,(A6)
;       OR      ( u1 u2 -- or )
301E            MOVE.W  (A6)+,D0
8156            OR.W    D0,(A6)
;       XOR     ( u1 u2 -- xor )        Exclusive OR
301E            MOVE.W  (A6)+,D0
B156            EOR.W   D0,(A6)
;       1'S     ( u -- compl )          One's complement
4656            NOT.W   (A6)


;*********************************************************************
; Control operations
;*********************************************************************
; Note.. all control structures use the PC relative addressing mode
;        this makes the code position independent.
;
;       IF      ( f -- )
;       Takes an entry off the stack and branches if the value
;       is FALSE (0).
4A5E            TST.W   (A6)+           ;Remove and test flag
6700 ****       BEQ     ELSE
;       ELSE    ( -- )
;       Branches around the else part and provides a target for the
;       false branch of the if part.
6000 ****       BRA     ENDIF
        ELSE
;       ENDIF   ( -- )
;       Provides a target for the branch around the else part, and if
;       the else is missing provides a target for the false branch of
;       the if part.
        ENDIF
;       BEGIN   ( -- )
;       Provides a target for a branch back from UNTIL, AGAIN, REPEAT.
        BEGIN
;       UNTIL   ( f -- )
;       Takes an entry off the stack and branches to BEGIN if the
;       value is FALSE (0).
4A5E            TST.W   (A6)+           ;Remove and test flag
6700 ****       BEQ     BEGIN
;       AGAIN   ( -- )
;       Always branches to BEGIN
6000 ****       BRA     BEGIN
;       WHILE   ( f -- )
;       Takes an entry off the stack and branches to REPEAT if the
;       value is FALSE (0).
4A5E            TST.W   (A6)+           ;Remove and test flag
6700 ****       BEQ     REPEAT
;       REPEAT  ( -- )
;       Always jumps back to begin and provides a target for WHILE.
6000 ****       BRA     BEGIN
        REPEAT
;       DO      ( limit index -- )
;       Remove the index and limit from the data stack and put on
;       the return stack. Provides a target for LOOP and +LOOP
2F1E            MOVE.L  (A6)+,-(A7)
;       LOOP    ( -- )
;       Increment the index and test for end of the loop.
5257            ADDQ.W  #1,(A7)         ;Increment index
4C97 0003       MOVEM.W (A7),D0/D1      ;Get index (D0) and limit (D1)
B041            CMP.W   D1,D0
6D00 ****       BLT     DO              ;Continue if (index - limit) < 0
588F            ADDQ.L  #4,A7           ;Drop index and limit
;
;       +LOOP   ( n -- )
;       Add n to the index then
;       IF n > 0 then continue if (index - limit) < 0
;       ELSE continue if (index - limit) >= 0.
301E            MOVE.W  (A6)+,D0        ;Get increment
D157            ADD.W   D0,(A7)         ;Update index
4C97 0006       MOVEM.W (A7),D1/D2      ;Get index (D1) and limit (D2)
4A40            TST.W   D0              ;Test for negative
6E04            BGT.S   $01
B441            CMP.W   D1,D2           ;Test (limit - index)
6002            BRA.S   $02
B242      $01   CMP.W   D2,D1           ;Test (index - limit)
6D00 **** $02   BLT     DO              ;Continue if (condition tested) < 0
588F            ADDQ.L  #4,A7           ;Drop index and limit
;       LEAVE   ( -- )
;       Terminate loop by setting limit equal to index
3F57 0002       MOVE.W  (A7),2(A7)


;
; Routines for accessing external programs and subroutines.
;
;       JSR.W   ( short address -- )
;       Jump to subroutine using short address from tos
305E            MOVEA.W (A6)+,A0
4E90            JSR     (A0)
;       JSR.L   ( long address -- )
;       Jump to subroutine using long address from tos
205E            MOVEA.L (A6)+,A0
4E90            JSR     (A0)
;       JMP.W   ( short address -- )
;       Jump to location pointed to by short address on stack
305E            MOVEA.W (A6)+,A0
4ED0            JMP     (A0)
;       JMP.L   ( long address -- )
;       Jump to location pointed to by long address on stack
205E            MOVEA.L (A6)+,A0
4ED0            JMP     (A0)
;*********************************************************************
; Initialization operations
;*********************************************************************
; Note.. the following words initialize the registers of the M68000
;
;       A5LD    Load the variable pool pointer
2A7C 0000 0000  MOVEA.L #0,A5
;       A6LD    Load the data stack pointer
2C7C 0000 0000  MOVEA.L #0,A6
;       A7LD    Load the return stack pointer
2E7C 0000 0000  MOVEA.L #0,A7
                END


Listing 8.3 
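
The termination tests compiled by LOOP and +LOOP in the listing above can be traced with a small model. This is a sketch that follows the instruction sequences (CMP.W followed by BLT), not a statement of the FORTH standard: the function names are hypothetical, and 16-bit wraparound of the index is ignored for simplicity.

```python
def loop_step(index: int, limit: int) -> tuple[int, bool]:
    """LOOP: add 1 to the index, continue while (index - limit) < 0."""
    index += 1
    return index, (index - limit) < 0


def plus_loop_step(index: int, limit: int, n: int) -> tuple[int, bool]:
    """+LOOP: add n to the index, then test in the direction of n."""
    index += n
    if n > 0:
        return index, (index - limit) < 0    # CMP.W D2,D1 / BLT
    return index, (limit - index) < 0        # CMP.W D1,D2 / BLT


def run_do(limit: int, start: int, step: int) -> list[int]:
    """Collect the index values a `limit start DO ... step +LOOP` visits."""
    seen = []
    index = start
    while True:
        seen.append(index)                   # the loop body runs first
        index, again = plus_loop_step(index, limit, step)
        if not again:
            return seen


if __name__ == "__main__":
    print(run_do(10, 0, 3))
    print(run_do(0, 10, -5))
```

As in the compiled code, the body always runs at least once because the test sits at the bottom of the loop. LEAVE fits the same model: copying the index into the limit makes the next test fail.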


Screen # 8

( Eratosthenes Sieve Prime Number program in FORTH )
( by Jim Gilbreath, BYTE September 1981 page 190 )
FORTH DEFINITIONS DECIMAL
8190 CONSTANT SIZE 0 VARIABLE FLAGS SIZE ALLOT

: DO-PRIME FLAGS SIZE 1 FILL
  0 SIZE 0
  DO FLAGS I + C@
    IF I DUP + 3 + DUP I +
      BEGIN DUP SIZE <
      WHILE 0 OVER FLAGS + C! OVER + REPEAT
      DROP DROP 1+
    THEN
  LOOP
  . ." primes " ;


Screen # 9 


( Eratosthenes Sieve Prime Number program for M68K compiler ) 
( Original by Jim Gilbreath, BYTE September 1981 page 190 ) 
DECIMAL 0 M68CON #0 ( Note.. 0 is used a lot in the following ) 
8190 CONSTANT SIZE SIZE M68CON SIZE ( Both forms needed ) 
M68VAR FLAGS SIZE M68ALLOT 

:M68MAC DO-PRIME FLAGS SIZE 1 LITERAL FILL 
#0 SIZE #0 
DO FLAGS I + C@ 
IF I DUP + 3 LITERAL + DUP I + 
BEGIN DUP SIZE < 
WHILE #0 OVER FLAGS + C! OVER + REPEAT 
DROP DROP 1+ 
THEN 
LOOP ;M68MAC 
--> 


Screen # 10 


( Test program to run the prime number program ) 
HEX 
:M68MAC INIT ( Initialize all the registers ) 
800. A5LD ( Load variable pointer ) 
4000. A6LD ( Load data stack pointer ) 
7800. A7LD ( Load return stack pointer ) 
;M68MAC 
DECIMAL 
:M68K TEST ( Run the prime number test ten times ) 
INIT 
10 LITERAL #0 DO DO-PRIME DROP LOOP 
;M68K 


Screen # 11 


( Eratosthenes Sieve Prime Number program improved version ) 
( Original by Jim Gilbreath, BYTE September 1981 page 190 ) 
DECIMAL 0 M68CON #0 ( Note.. 0 is used a lot in the following ) 
8190 CONSTANT SIZE SIZE M68CON SIZE ( Both forms needed ) 
SIZE M68CARY FLAGS 

:M68MAC DO-PRIME #0 FLAGS SIZE 1 LITERAL FILL 
#0 SIZE #0 
DO I FLAGS C@ 
IF I 2* 3 LITERAL + DUP I + 
BEGIN DUP SIZE < 
WHILE #0 OVER FLAGS C! OVER + REPEAT 
2DROP 1+ 
THEN 
LOOP ;M68MAC 
--> 


Screen # 12 


( Test program to run the prime number program ) 
HEX 
:M68MAC INIT ( Initialize all the registers ) 
800. A5LD ( Load variable pointer ) 
4000. A6LD ( Load data stack pointer ) 
7800. A7LD ( Load return stack pointer ) 
;M68MAC 
DECIMAL 
:M68K TEST ( Run the prime number test ten times ) 
INIT 
10 LITERAL #0 DO DO-PRIME DROP LOOP 
;M68K 


; ********************************************************************** 


        .PROC PRIME 
; Eratosthenes Sieve Prime Number program in M68000 assembly language. 
; This program provides a baseline for evaluating the performance of the 
; M68K compiler. 
; 
; Register variables: 
; D0.. Temporary storage 
; D1.. Number of iterations 
; D2.. I - DO loop counter 
; D3.. P - candidate prime number 
; D4.. K - array index used with P 
; D5.. COUNT - number of primes 
; D6.. SIZE - size of the flags array 
; A0.. FLAGS - base address of the FLAGS array 
; A1.. Temporary address register used for initializing FLAGS 
; 
; Note.. 
; This program does not correspond exactly to the FORTH version but 
; the algorithm is the same so this should be a fair comparison. 
; The portions of the FORTH code which correspond to the sections of 
; assembler code are indicated in the comments. 

SIZE    .EQU 8190 
FLAGS   .EQU 800H ;Base address of the FLAGS array 
ITER    .EQU 10 ;Number of iterations of the sieve 

        MOVE.W #ITER,D1 
        MOVE.W #SIZE,D6 
        MOVEA.W #FLAGS,A0 
        BRA.S ENDIL ;Enter the iteration loop at the proper place 
STARTIL ;Start of the iteration loop 
; FORTH code: 
; FLAGS SIZE 1 FILL 
        MOVEQ #1,D0 
        MOVE.W D6,D2 ;Load SIZE into D2 
        MOVEA.L A0,A1 ;Address of element of FLAGS to set 
        BRA.S $02 
$01     MOVE.B D0,(A1)+ 
$02     DBRA D2,$01 
; FORTH code: 
; 0 SIZE 0 DO 
        CLR.W D5 ;Clear prime counter 
        CLR.W D2 ;Clear DO loop counter 
DOLOOP 
; FORTH code: 
; FLAGS I + C@ IF 
        BTST #0,0(A0,D2.W) 
        BEQ.S THEN ;If false, skip the true part of IF structure 
; FORTH code: 
; I DUP + 3 + DUP I + 
        MOVE.W D2,D3 ;P = I + I + 3 
        ADD.W D3,D3 
        ADDQ.W #3,D3 
        MOVE.W D3,D4 ;K = P + I 
        ADD.W D2,D4 
; FORTH code: 
; BEGIN DUP SIZE < 
; WHILE 0 OVER FLAGS + C! OVER + REPEAT 
; DROP DROP 1+ 
BEGIN   CMP.W D6,D4 
        BGE.S $03 
        CLR.B 0(A0,D4.W) 
        ADD.W D3,D4 
        BRA.S BEGIN 
$03     ADDQ.W #1,D5 ;Update prime counter 
; FORTH code: 
; THEN 
; LOOP 
THEN    ADDQ.W #1,D2 
        CMP.W D6,D2 
        BLT.S DOLOOP 
ENDIL   DBRA D1,STARTIL ;Repeat for requested number of iterations 
        .END 


9 


A 68000 
Forth Assembler 


Michael A. Perry 


This chapter describes the nature of assemblers in Forth and 
the use and implementation of a fairly typical example: a 
Forth assembler for the 68000. It discusses some of the 
tradeoffs involved, but for clarity it ignores some 
implementation dependencies. This chapter was originally 
published in Dr. Dobb's Journal, September 1983. 


A Forth system is an interactive programming environment in which 
routines (called words) are kept in a data structure called the dictionary. The 
programmer can add new words, which can be defined either in terms of 
existing words or in terms of machine code. An assembler in a Forth 
environment is usually a tool for defining words in machine code. It is not 
intended for writing standalone applications in assembler. 

A Forth cross-assembler is a very similar tool that produces code that will 
run in a different environment, possibly on a different processor. (Here I am 
concerned only with ordinary Forth assemblers.) These assemblers are small 
because they use existing Forth words, so the size of the assembler is only an 
increment to the size of the system. For example, this assembler requires 
about 3K added to a 12K system. For comparison, the assembler provided 
with CP/M-68K requires 44K plus 6K for a symbol table file. 

In writing applications, you rarely use the assembler until the algorithms 
have been tested and debugged in high-level code. At the initial stages of a 
design, the programmer's time is far more valuable than the machine's. When 
an application is running, it might prove too slow. If so, determine where the 
most time is being spent and rewrite that routine's code in assembly 
language. Repeat this process until the application is fast enough. Avoid 
writing in assembly code unless necessary—it limits portability. In rare 
cases, if you have a very time-critical application, you may end up with 
nearly everything in assembly code. Even then, writing the high-level code 
first will produce results, and the final product, most quickly. Always be 
prepared to throw out early designs and start over. The key to success is 
iteration: Do it over until you get it right. That's why it is so important to 
implement a simple version of the application first to find out if your basic 
idea is workable. 


Forth Vocabularies 

The name of a Forth word can be any string of up to 31 ASCII characters, 
excluding the space character. Words in the dictionary are organized into 
groups called vocabularies. The assembler is a vocabulary named 
ASSEMBLER. Most words in the assembler are named for the instruction 
mnemonics of the host processor, in this case the 68000. When such a word 
is executed, it appends an appropriate sequence of bytes to the dictionary. 
Other words in the assembler are used for addressing modes, structured 
conditionals, macros and possibly other extensions. Given a few constraints, 
it is possible to implement a very powerful assembler quite simply. 

Two major constraints are on syntax and forward reference. As is generally 
true in Forth, forward reference is not allowed. That is, you cannot use 
anything until it is defined. I believe this constraint has merits, but it raises 
endless debate that I won't try to end here. It is much simpler (and therefore 
faster in execution) to use an operator final syntax, which means that 
instructions are written in the form 


source destination operation 


While unconventional, this format is very flexible and simple to use. An 
algebraic syntax preprocessor could easily be used if the speed degradation 
could be tolerated. 

The dictionary grows toward high memory as new words are added. Most 
data structures are also allocated within the dictionary. The variable DP points 
to the next free address. The word HERE returns the value of DP. The word , 
(comma) appends a 16-bit value to the dictionary. The word C, (C-comma) 
appends an 8-bit value (1 byte). The assembler is built on comma and C- 
comma. 
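
As an illustration only, the dictionary mechanics described above can be modeled in a few lines of Python. The class and method names here are assumptions made for the sketch (they are not part of any Forth system); the one property carried over from the text is that comma appends 16 bits, C-comma appends 8 bits, and HERE reports the next free address. Big-endian byte order is assumed because the target is the 68000.

```python
class Dictionary:
    """Toy model of a Forth dictionary growing toward high memory."""

    def __init__(self, origin=0):
        self.origin = origin          # address of the first dictionary byte
        self.memory = bytearray()     # bytes appended so far

    def here(self):
        """Model of HERE: the next free address (the value of DP)."""
        return self.origin + len(self.memory)

    def c_comma(self, value):
        """Model of C, : append one byte."""
        self.memory.append(value & 0xFF)

    def comma(self, value):
        """Model of , : append a 16-bit value, high byte first."""
        self.c_comma(value >> 8)
        self.c_comma(value)


d = Dictionary(origin=0x1000)
d.comma(0x301F)      # one 16-bit opcode word
d.c_comma(0x7F)      # one extra byte
print(hex(d.here()), d.memory.hex())
```

Every assembler word discussed below ultimately reduces to a handful of calls like these.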


Error Control 

When I use an assembler, I expect it to have several important character- 
istics. The first is correctness: Good input results in good output. The second 
is speed: I want the assembler to produce its output as rapidly as possible. 
The third is predictability: When I write assembler code, I will do the 
optimizing, instead of using an optimizing assembler—I hate surprises. 
Finally, I prefer to avoid overlaid (or "smart") operators—that is, operators 
that allow me to be lazy by deciding that this time ADD really means ADDI, 
ADDQ or something else. They are slower and less predictable than non- 
overlaid operators. Because Forth assemblers are extensible, anyone can add 
smart operators on top of the dumb ones. 

Any amount of error checking can be added to an assembler. Ideally, an 
assembler should only accept legal input. It can be expensive and time- 
consuming, however, to provide exhaustive error control. Fortunately, most 
errors can be caught simply. The cost of checking that the stack depth does 
not change during a definition (nothing extra is left or consumed) and that the 
structured conditionals are balanced is very small. 

The next level of error detection is to test that only allowed addressing 
modes are used with each instruction. For a very orthogonal architecture, this 
procedure is easy. Unfortunately, the 68000 does not quite qualify because 
there are some exceptions; for example, PC-relative writes are not allowed. 
Even so, many cases can be easily checked. Although I do not ordinarily use 
them, I have included some words for checking that instructions are given 
valid addressing modes. ??DN will abort unless fed a data register direct mode. 
??AN will abort unless fed an address register direct mode. ??JMP will abort 
unless fed a valid jump mode. Beyond this level, one is faced with rapidly 
diminishing returns; much more effort gives only small gains. 


Assembler Usage 

For a detailed and fairly accurate description of the MC68000, see the 
manual. As an example of how the assembler is used, take the definition of 
the word FILL, which fills an area of memory with a given byte. It is used in 
the form 


address length byte FILL 


Notice that FILL takes three parameters from the stack and does not leave 
any. The definition of FILL is shown in Figure 9.1. The word CODE is a 
defining word. It creates a header for the new word named FILL, sets its code 
field to point to its parameter field and leaves the system in execution state. 
The assembler does not use the Forth compiler, as is often assumed. The 
header is something like a symbol table entry. The code field of any word 
points to the code to be executed for that word. Ordinarily, all words with the 
same parent (defining word) share a code segment in the parent. Words defined 
by CODE consist of a unique code segment, contained in the body of the 
word, which is pointed to by the code field of the word. The remaining words 
in the definition assemble a sequence of bytes into the body of the word. 


CODE FILL (S adr len val -- ) 
SP )+ D0 MOVE ( pop 'val' into D0 ) 
SP )+ D1 MOVE ( pop 'len' into D1 ) 
SP )+ D7 MOVE ( pop 'adr' into D7 whose high half is 0 ) 
D7 A0 LMOVE ( move all of D7 into A0 ) 
1 D1 SUBQ ( decrement D1: DBRA goes to -1, not to 0 ) 


D1 DO D0 A0 )+ BYTE MOVE LOOP 
( loop until D1 is -1, each time moving the byte 
in D0 to the address in A0 and incrementing A0 ) 
NEXT ( compile a jump to 'next' ) 
END-CODE ( end definition ) 


FIGURE 9.1 Description of FILL. An example of the 
MC68000 Assembler. 

Assembler opcode mnemonics such as MOVE use the word , (comma) to 
append numbers in line to the parameter field of the word being defined. If the 
new word is to return to Forth after executing, its definition will end with 
NEXT, a macro that assembles a jump to Forth's address interpreter. Its 
definition is 


: NEXT (NEXT) #) JMP ; 


where (NEXT) is the address of the interpreter, #) indicates the absolute word 
address mode, and JMP uses comma to append the proper opcode and address 
to the word being defined. END-CODE marks the end and does some error 
checking and housekeeping. 

SP is the name of the Forth virtual machine's stack pointer. The word SP 
leaves a value on the stack, representing direct-addressing mode using a7 
(address register 7). The word a7 has the same effect: both are just constants. 
The word )+ modifies the value left by SP to indicate the "indirect with auto- 
increment" addressing mode. How this works will be covered later. The word 
d0 represents data register 0. 

The word MOVE assembles a 68000 move instruction. MOVE requires a 
pair of arguments, each representing an addressing mode. In this case, the 
assembled code will move 16 bits from the address pointed to by SP into D0, 
and increment SP by 2. The size of the operation is determined by the 
contents of the variable SIZE and defaults to 16-bit at the start of each 
definition. SIZE is set by BYTE, WORD and LONG. Notice the difference 
between WORD in the ASSEMBLER vocabulary and WORD in the Forth 
vocabulary. In the Forth vocabulary, the word WORD accepts the next token 
from the text input stream. 

The word CODE leaves the system in the ASSEMBLER vocabulary, so the 
correct word will be found. The word LMOVE was added as a special case of 
MOVE, because of the register shuffle mentioned earlier. LMOVE always 
assembles a long move, without changing SIZE. Notice the use of DO and 
LOOP in the assembler. DO is passed a data register which will contain the 
loop count at execution time. DO passes HERE and the register on to LOOP, 
which assembles a DBRA back to the DO, using the given register. 
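
The reason FILL decrements the count before entering the loop can be seen in a small Python model of the DBRA instruction. This is only a sketch of the documented 68000 behavior (decrement the low word of the register, branch unless the result is -1); the function name is made up for the illustration.

```python
def dbra_loop_count(initial):
    """Return how many times a loop body closed by DBRA executes.

    DBRA decrements the 16-bit counter and branches back unless the
    result is -1 (0xFFFF), so the body runs initial + 1 times.
    """
    counter = initial & 0xFFFF
    runs = 0
    while True:
        runs += 1                           # the loop body executes
        counter = (counter - 1) & 0xFFFF    # DBRA decrements the low word
        if counter == 0xFFFF:               # reached -1: fall through
            break
    return runs


length = 4
print(dbra_loop_count(length - 1))   # count after the SUBQ in FILL
```

With the count pre-decremented by one, a length of 4 fills exactly 4 bytes; without the decrement the loop would run one time too many.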


Implementation 

There are many possible approaches to writing assemblers in Forth, but two 
major ones. One method is to use many variables to control the 
assembly of each instruction, then to clear them for the next instruction. A 
more common and more desirable approach is used here. Almost all 
information is passed on the stack, which need not be initialized. 

There are also two major approaches to interpreting the addressing mode 
information passed to an assembling word. One is to use something 
equivalent to nested IF-ELSE statements to step through a sequence of choices 
(conditional branches) of what to do. The other approach, used here, is to have 
the addressing mode words pass values that can simply be masked and merged 
(ANDed and ORed) to produce the required machine code values. In either case 
the resulting numbers are then appended to the dictionary. Logic calculations 
proceed far faster than decisions as a rule, and so the assembler tends to be 
much faster when using them. 

While reading the following descriptions, refer to the listings and their 
shadows (comments). The key idea is that a machine code instruction is 
viewed as a collection of bit fields. These bit fields are specified in the manual 
for the CPU chip. Some are common to many instructions, like the source 
and destination, mode and register fields. 

As mentioned earlier, instructions that need to know the size of the data to 
be operated on generally use the variable SIZE. The position of the bit fields 
that control data size moves around more than any other. Nearly all required 
values are built into the value of SIZE in each of the three cases: BYTE, 
WORD and LONG. Notice that at this point in the code I switched to OCTAL. 
The 68000 instructions contain many 3-bit fields and can be neatly and most 
naturally represented as octal digits. I was forced to forget temporarily my 
bias toward hexadecimal. The bit fields for modes and registers are usually in 
this pattern: 


opcode | dest reg | dest mode | source mode | source reg 
15  12 | 11     9 | 8       6 | 5         3 | 2        0 
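
To see why octal is the natural radix here, the following Python sketch packs the fields of a MOVE opcode word by hand. It is an illustration, not the assembler's own code: `encode_move` and `MOVE_SIZE` are names invented for the example, and the size codes are the standard 68000 MOVE encodings (byte 01, long 10, word 11 in bits 13-12).

```python
# MOVE's two-bit size field (bits 13-12): byte=01, long=10, word=11.
MOVE_SIZE = {"byte": 0b01, "long": 0b10, "word": 0b11}


def encode_move(size, src_mode, src_reg, dst_mode, dst_reg):
    """Pack the five bit fields of a MOVE opcode word."""
    return ((MOVE_SIZE[size] << 12)
            | (dst_reg << 9) | (dst_mode << 6)
            | (src_mode << 3) | src_reg)


# MOVE.W (A7)+,D0: source is mode 3, register 7; destination is mode 0, register 0.
opcode = encode_move("word", src_mode=3, src_reg=7, dst_mode=0, dst_reg=0)
print(f"octal {opcode:06o}  hex {opcode:04X}")
```

In octal the last four digits read directly as destination register, destination mode, source mode and source register, which is much harder to see in the hexadecimal form.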


To define the words that specify registers and addressing modes, I used a 
small trick. I created a multi-defining word, REGS, which uses CONSTANT in 
a loop to CREATE several related constants. REGS is used only twice: once 
for the data registers and once for the address registers, which are modes 0 and 
1: 


Mode 0 is data register direct, so d5 is 5005. 
Mode 1 is address register direct, so a3 is 3113. 


Words defined by MODE are used after an address register, and replace the 
two mode digits (which were 1) with the new mode values. This is done by 
ANDing off the old values and ORing in the new. They are all address register 
indirect, with extras. 


Mode 2 is address register indirect, so a6) is 6226. 

Mode 3 is the same, with post-increment, so a7)+ is 7337. 

Mode 4 is the same with pre-decrement, so a7-) is 7447. 

Mode 5 is the same, plus displacement, so 123 a1 D) is 1551, with the 
displacement value (123) remaining unaffected under the register/mode value 
on the stack. 

Mode 6 is the same, plus an index and a displacement. 

123 D4 a1 DI) is 123 under 4004 under 1661. 

The remaining possible mode value, mode 7, is used for all remaining 
modes, which are distinguished by values in the register fields. These modes 
are defined as constants. #) is 0770, and represents the absolute (16-bit) 
addressing mode. The name represents "immediate indirect" (think about it). 


L#) is 1771, and represents the long absolute (32-bit) addressing mode. 

PCD) is 2772, and represents the program counter relative with displace- 
ment mode. 

123 PCD) is 123 under 2772. 

PCDI) is 3773, and represents the program counter relative displaced, in- 
dexed mode. 

123 D4 PCDI) is 123 under 4004 under 3773. 

# is 4774, and represents immediate data, 16 or 32 bits. 

456 # is 456 under 4774. 


Notice that in every case from one to three items (16-bit numbers) are left 
on the stack to represent an addressing mode. The top item usually becomes 
part of the first 16 bits of an instruction, along with the opcode. Extra items, 
if any, are assembled following the opcode word. 
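
The register and mode values above can be reproduced with a short Python sketch. It is a model of the idea rather than a transcription of the Forth source: `regs`, `mode` and the helper names are assumptions made for the illustration, and a Python list stands in for the Forth stack, with the bottom item first.

```python
def regs(mode_digit):
    """Model of REGS: eight constants with the octal digit pattern r m m r."""
    return [(r << 9) | (mode_digit << 6) | (mode_digit << 3) | r
            for r in range(8)]


D = regs(0)    # data register direct:    D[5] is octal 5005
A = regs(1)    # address register direct: A[3] is octal 3113


def mode(new_mode, ea):
    """Model of a MODE word: AND off both mode digits, OR in the new ones."""
    return (ea & 0o7007) | (new_mode << 6) | (new_mode << 3)


def indirect(ea):
    return [mode(2, ea)]            # )

def postinc(ea):
    return [mode(3, ea)]            # )+

def predec(ea):
    return [mode(4, ea)]            # -)

def displaced(disp, ea):
    return [disp, mode(5, ea)]      # D): displacement stays underneath


for items in (indirect(A[6]), postinc(A[7]), predec(A[7]), displaced(123, A[1])):
    print([oct(items[-1])] + items[:-1])
```

No decisions are made anywhere in this code; each addressing mode is produced by one mask and one merge, which is the point of the design.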

Certain fields are selected (by masking off the rest) more frequently than 
others. The word FIELD creates words that select given fields. RS and RD 
select the source and destination register fields. MS selects the source mode 
field. The generic term for a completely specified addressing mode is an 
effective address (EA). The word EAS selects the source effective address, 
which consists of the source mode and register fields. LOW selects the low 8 
bits. The opcode word often contains an EAS field. The word SRC performs 
OVER EAS OR, which installs that field into the opcode word. The word DST 
installs the destination register field. 
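
A hedged Python sketch of the field selectors follows. The mask values are inferred from the bit-field layout given earlier, and the argument order of `src` and `dst` is an assumption made for the illustration; the Forth words operate on the stack rather than on named parameters.

```python
# Field masks in octal, following the layout: reg | mode | mode | reg.
RS = 0o0007      # source register field
MS = 0o0070      # source mode field
RD = 0o7000      # destination register field
EAS = MS | RS    # source effective address: mode and register together
LOW = 0o377      # low 8 bits


def src(ea, opcode):
    """Model of SRC: install the source effective address in the opcode word."""
    return opcode | (ea & EAS)


def dst(ea, opcode):
    """Model of DST: install the destination register field."""
    return opcode | (ea & RD)


# ADD.W (A6)+,D1 built from the base opcode for ADD.W <ea>,Dn (0xD040).
a6_postinc = 0o6336
d1 = 0o1001
word = dst(d1, src(a6_postinc, 0xD040))
print(f"{word:04X}")
```

Because every mode value carries its register number in both the high and low digit, the same value can serve as either source or destination simply by choosing which mask to apply.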

The Forth virtual machine contains five registers that are assigned to 
particular 68000 registers. For convenience, either the 68000 or virtual 
machine register names may be used with the assembler. 

Any mode with extra values assembled following the opcode word is said 
to use extended addressing. The extended addressing modes are handled by six 
words and a buffer. DOUBLE? leaves a flag that is true if the mode on top of 
the stack requires that 32 extra bits be assembled. INDEX? looks at a mode and 
modifies its extra items into the proper format if it is an indexed mode. 
MORE? leaves a true flag if the mode has extra items. MORE, appends any 
extra items to the dictionary (following the opcode word). 

Some instructions require two modes, to specify both source and 
destination addresses. The source mode is specified first, so it is below the 
destination mode on the stack. Each mode consists of one to three values on 
the stack. The source mode is used before the destination mode, so it is 
necessary to move the destination mode values to a buffer until needed. 
EXTRA? saves any extra items in a buffer named EXTRA, and leaves only the 
mode value. ,EXTRA retrieves such extra items and appends them to the 
dictionary. 


Defining Words 


Most of the rest of the assembler consists of the definition and use of 
defining words that create groups of mnemonics. A couple of examples will 
suffice. If you are not familiar with defining words in Forth, you should be. 
That is its most powerful feature. 

The word IMM creates words that implement immediate assembler 
instructions. I will repeat its definition here and go through it in detail. 


: IMM 
CONSTANT 
DOES> @ >R EXTRA? EAS R> OR 
SZ3 , LONG? ?, ,EXTRA ; 


Usage: 3000 IMM ADDI 
Assembly: n ea ADDI 
Example: 123 A5 ) ADDI 


Each time IMM is used to define an instruction mnemonic word, it saves 
in the definition of that word a constant value that distinguishes it from other 
immediate words. That value is the opcode for the instruction. Immediate 
instructions contain the following bit fields: 


opcode | size | mode | reg 
15   8 | 7  6 | 5  3 | 2 0 


These codes are followed by 16 or 32 bits of data. When the instruction 
word is executed, it performs the code following DOES> in IMM, with the 
address of its own parameter field on the top of the stack. It is that address 
that the constant (opcode) is built into. That value is read by the word @ and 
saved on the return stack by >R. ADDI is passed the immediate data to use 
under the mode items for an effective address. EXTRA? saves any extra items, 
EAS selects the mode and register fields to use, then the opcode is retrieved 
from the return stack with R> and combined with the EAS by OR. SZ3 
installs the appropriate size-bits from SIZE, and the word comma appends the 
opcode word to the dictionary. This leaves only the immediate data on the 
stack, and LONG? determines whether ?, should append 16 or 32 bits to the 
dictionary. Finally, the saved extras (if any) are retrieved and appended 
following the immediate data by ,EXTRA. 
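
The sequence of steps performed by an IMM-defined word can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's code: `assemble_addi` is a hypothetical function, the size codes are the standard 68000 values for bits 7-6 (byte 00, word 01, long 10), and the result is returned as a list of 16-bit words instead of being appended to a dictionary.

```python
ADDI = 0o3000                                   # the constant stored by 3000 IMM ADDI
SIZE_BITS = {"byte": 0o000, "word": 0o100, "long": 0o200}


def assemble_addi(data, ea, extras=(), size="word"):
    """Return the 16-bit words assembled for ADDI #data,<ea>.

    ea     -- mode value in the r m m r octal pattern, e.g. 0o5225 for (A5)
    extras -- extension words belonging to the effective address, if any
    """
    opcode = ADDI | (ea & 0o77) | SIZE_BITS[size]   # EAS, OR, then size bits
    words = [opcode]
    if size == "long":                              # 32 bits of immediate data
        words += [(data >> 16) & 0xFFFF, data & 0xFFFF]
    elif size == "byte":                            # byte sits in the low half
        words.append(data & 0xFF)
    else:
        words.append(data & 0xFFFF)
    words.extend(extras)                            # saved extras come last
    return words


print([f"{w:04X}" for w in assemble_addi(123, 0o5225)])
print([f"{w:04X}" for w in assemble_addi(1, 0o5555, extras=[4])])
```

The second call models an operand with a displacement: the immediate data is assembled first and the displacement follows it, matching the order described in the text.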

Numerous other defining words built along very similar lines are used to 
define most of the remaining instructions. Many words are in a class by 
themselves and so are defined by the word : (colon), just like macros. The 
conditional instructions are so regular that I used another trick defining word. 
SETCLASS repeatedly uses a given defining word, each time with a different 
argument, to define multiple words at once. It is used to define all 46 
conditional instructions by repeating each of the three defining words 16 
times. Two junk mnemonics are also created but will not be used. It would be 
slightly better in this case to define all 46 words separately, but I wanted to 
show what kinds of things can be done. 

Finally, we come to the structured conditionals. The following examples 
will be discussed: 


A3 )+ D1 CMP 0< 


IF DO A7 ) MOVE 
ELSE A7 ) DO MOVE 
THEN 

BEGIN A3 D2 CMP 0= 
WHILE AO )+ DO MOVE 
REPEAT 


In the first example, the result of a comparison is to affect certain flags in 
the status register. IF assembles a conditional branch instruction whose 
opcode (and condition) is specified by 0<, which leaves the value for a branch 
if greater than or equal to zero. ELSE resolves the branch address for the IF and 
assembles an unconditional branch whose address is resolved by THEN. 
WHILE assembles a conditional branch to the address following the REPEAT, 
which assembled an unconditional branch back to the BEGIN. 
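
The one-pass resolution of forward branches can be modeled in Python. The sketch below is an assumption-laden illustration: the class `Asm` and its method names are invented, 0x6C00 (BGE.S) is assumed to be the value left by 0<, and only IF, ELSE and THEN are shown. What it does take from the 68000 itself is that a short branch keeps its offset in the low byte of the opcode word and that the offset is measured from the address of the branch plus 2.

```python
class Asm:
    """Toy one-pass assembler with structured conditionals."""

    def __init__(self):
        self.code = bytearray()
        self.stack = []                    # addresses of unresolved branches

    def here(self):
        return len(self.code)

    def comma(self, word):
        self.code += word.to_bytes(2, "big")

    def _resolve(self, branch_at):
        offset = self.here() - (branch_at + 2)
        if not 0 < offset <= 127:
            raise ValueError("branch target out of 1-byte range")
        self.code[branch_at + 1] = offset  # patch the reserved offset byte

    def if_(self, branch_opcode):
        self.stack.append(self.here())     # remember where the branch lives
        self.comma(branch_opcode)          # offset byte is 0 for now

    def else_(self):
        pending = self.stack.pop()
        self.stack.append(self.here())
        self.comma(0x6000)                 # BRA.S with offset to be filled in
        self._resolve(pending)             # IF now lands after the BRA

    def then(self):
        self._resolve(self.stack.pop())


a = Asm()
a.comma(0xB25B)      # CMP.W (A3)+,D1
a.if_(0x6C00)        # BGE.S, assumed to be the condition left by 0<
a.comma(0x3E80)      # MOVE.W D0,(A7)
a.else_()
a.comma(0x3017)      # MOVE.W (A7),D0
a.then()
print(a.code.hex())
```

The stack of pending branch addresses takes the place of labels, which is why none are needed in the source.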

Notice that there is no need for labels; the large number of labels with 
meaningless names used for branch destinations is a primary source of clutter 
in ordinary assembler code. Also note that the structured conditionals defined 
here use only 1-byte offsets. Because the assembler is one pass, it is 
necessary to reserve the space for the offset before its size is known. Since 
CODE routines in Forth are always rather short, 1 byte is enough. If it is not, 
I simply replace these definitions with nearly identical definitions that use 
only 16-bit offsets. 

Finally, it should be noted that there is no need for words to create data 
structures for use with the assembler. A Forth assembler is part of a Forth 
environment, and any data structure created by ordinary Forth defining words 
can be referred to by assembler routines. For example: 


CODE BAR FOO #) NEG NEXT 
END-CODE 


BAR will negate the contents of the variable FOO. 


Conclusion 

For large assembly applications, this Forth assembler would certainly be 
difficult to use. But in its proper place—when used to define new, faster 
words in the Forth vocabulary—it can't be beat. Its small size and simple 
structure make it a valuable addition to any 68000 Forth system. 


A 68000 
Cross-Assembler 


Brian Anderson 


The 68000 cross-assembler presented in this chapter is fully 
functional, lacking only provision for relocatable code. 


I tend to be obsessive. For more than two years I have been admiring the 
Motorola 68000 from afar, wanting to do something with it but not 
knowing quite what. Although I've worked a bit with assembly language and 
with microprocessor hardware (Z80, Z8000 and M6800), I wasn't really what 
you'd call a "let's get close to the metal" type of hacker. What to do? 

My love affair with Modula-2 started about a year ago. After coming up 
through Fortran, Pascal, and C (with a smattering of COBOL, BASIC and 
even LISP thrown in), I thought that Modula-2 seemed an ideal language. 
Here was a high-level structured language that you could actually use to write 
operating systems and, unlike C, even make sense of the code afterward. I had 
it! I'd write a 68000 assembler in Modula-2. I was off to the races. 


Bottoms Up? 


Modula-2 is a great language for both top-down and bottom-up design. The 
top-down aspect is usually more highly touted, but often the immediate need 
is for some tool that enables you to proceed with a design. The first thing I 
realized as I started to plan this project was that I would need some way to 
handle 32-bit integers. The 68000 uses operands that are up to 32 bits wide, 
whereas the implementation of Modula-2 that I was using (Hochstrasser's 
Modula-2 System for Z80 CP/M) provided only 16-bit integers, since the 
compiler was written before Wirth amended the language to include 
LONGINT, LONGCARD and LONGREAL. 

My first task, then, was to create a bottom-up module to handle the 32-bit 
numbers I would need throughout the project. The LongNumbers module 
provides all the facilities to input, manipulate and output a new data type that 
I called LONG, which acts essentially as an 8-digit hexadecimal number. 
Although I could have used assembly language or tricky machine-dependent 
low-level Modula-2 code to create a more efficient implementation, I decided 
to forgo efficiency for portability because I plan to transport the assembler to 
other environments. (I have ported the assembler to the IBM PC using the 
Logitech compiler.) 

Listings 10.2 and 10.13 show the implementation and definition modules 
for LongNumbers. The type LONG is simply an array of INTEGER. I chose 
INTEGER instead of CARDINAL or subrange [0 . . 15] to ease handling of 
carry/borrow in the arithmetic procedures. Most of the procedures are pretty 
straightforward, but some may need clarification. CardToLong and 
LongToCard provide conversions, which allow some flexibility so that the 
assembler can accept 68000 addressing offsets (and even constants) in 
hexadecimal, decimal or binary. Because CARDINAL has a much smaller 
range than LONG has, not all conversions are possible—LongToCard returns 
FALSE in such cases. StringToLong converts a sequence of ASCII characters 
into a LONG, returning FALSE if any illegal character is encountered. Like 
much of the code that I write now, LongCompare is patterned after a similar 
C routine for comparing strings. The two output routines, LongPut and 
LongWrite, are different from the rest of the routines in that they don't 
actually use type LONG; instead, they use open array parameters, which 
allows them to output an arbitrary sequence of hex digits. The last two 
routines, AddrBoundW and AddrBoundL, are needed because the 68000 insists 
that certain types of instructions and data begin at "even" addresses. 
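
A rough Python analogue of the LONG type may make the digit-array idea concrete. It is a sketch only: the real module is written in Modula-2 and appears in the listings, `long_add` and `long_to_string` are hypothetical names, and storing the least significant digit first is an assumption made to keep the carry loop simple.

```python
DIGITS = 8
HEX = "0123456789ABCDEF"


def string_to_long(text):
    """Analogue of StringToLong: return (ok, digits), least significant first."""
    text = text.upper()
    if not text or len(text) > DIGITS or any(ch not in HEX for ch in text):
        return False, [0] * DIGITS
    digits = [HEX.index(ch) for ch in reversed(text)]
    return True, digits + [0] * (DIGITS - len(digits))


def long_add(a, b):
    """Add two LONGs digit by digit, propagating the carry."""
    result, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry          # plain integers make the carry easy
        result.append(total % 16)
        carry = total // 16
    return result                      # a carry out of digit 8 is dropped


def long_compare(a, b):
    """Analogue of LongCompare: -1, 0 or 1, in the manner of C's strcmp."""
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            return -1 if x < y else 1
    return 0


def long_to_string(value):
    return "".join(HEX[d] for d in reversed(value))


_, a = string_to_long("0000FFFF")
_, one = string_to_long("1")
print(long_to_string(long_add(a, one)))
```

Holding each digit in a signed integer, as the text explains, lets the intermediate sum exceed 15 without any special handling.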

One other bit of bottom-up design occurred at the beginning of this project 
and resulted in a general-purpose library routine. I've always liked the way C 
handles command line arguments (with the standard parameters ArgC and 
ArgV). For readers unfamiliar with C, ArgC is a count of the number of 
command line arguments encountered by the operating system, and ArgV is a 
pointer to those arguments. My module CmdLin2 mimics this behavior for 
the Modula-2 environment. The definition module (Listing 10.22) shows 
ArgV as an ADDRESS; it is used in the main program as a POINTER TO 
ARRAY OF POINTER TO STRING in much the same way as C would use 
it. (Note: This is a machine-dependent module—CP/M-80 only! It assumes 
that the command line will be located at memory location 80H, with a count 
in the first byte. Programmers working in other environments will have to 
adapt at least the absolute addressing used in the implementation.) 
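
The convention that CmdLin2 relies on can be illustrated with a short Python sketch. The function name is hypothetical and the 128-byte buffer merely stands in for the memory starting at 80H; the only facts taken from the text are that the first byte holds a count and the command line follows it.

```python
def parse_command_tail(buffer):
    """Split a CP/M-style command tail into (argc, argv).

    buffer[0] holds the number of characters; the text follows it.
    """
    count = buffer[0]
    tail = bytes(buffer[1:1 + count]).decode("ascii")
    argv = tail.split()                # arguments are separated by blanks
    return len(argv), argv


# Simulated contents of the buffer after typing: X68000 PROG.ASM LIST
page = bytearray(128)
tail = b" PROG.ASM LIST"
page[0] = len(tail)
page[1:1 + len(tail)] = tail

argc, argv = parse_command_tail(page)
print(argc, argv)
```

Porting to another system would mean replacing only the way the buffer is obtained; the splitting logic stays the same.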


Design Phase 

With a few tools in hand and more confidence than any believer in 
Murphy's Law has a right to have, I sat down to do a requirement analysis. I 
came up with the specifications shown in Table 10.1. 

Jumping ahead just a bit: You might want to ask how well the final 
program adhered to these specifications. I fell short in a couple of areas and 
went beyond the original specifications in others. I never did implement the 
RORG assembler directive, because a linker is required for it to be of any use 
and I haven't written one (yet!). One additional pseudo-op (EVEN) is sup- 
ported, however, and limited ASCII string evaluation was added. The error 
messages finally implemented are somewhat more extensive. 


Two-pass assembler (no macros) 
Written entirely in high-level language 
Supports pseudo-ops: 
    ORG & RORG 
    EQU 
    DC.B 
    DC.W 
    DC.L 
    DS 
    END 
Creates formatted listing file 
Creates S-record file: 
    S0 header record 
    S2 data/code records 
    S8 trailer record 
Outputs error messages to console: 
    Undefined opcode/pseudo-op 
    Label defined twice 
    Undefined label 
    Operand inconsistent with opcode 
    This addressing mode not allowed 
    Phase error 
Numeric constants/operations: 
    Hex 
    Decimal 
    Binary 
    +/- 

TABLE 10.1 Specifications for X68000 Cross-Assembler. 


Implementation of X68000 

The X68000 Cross-Assembler is written in standard Modula-2, as defined 
by Niklaus Wirth in the second edition of Programming in Modula-2, 
(Springer-Verlag, 1983). The only possible machine dependency (aside from 
the CmdLin2 module already mentioned) arises from the assumption that 
INTEGERs, CARDINALs and BITSETs all occupy 16 bits of memory. Most 
microcomputer implementations and even several minicomputer imple- 
mentations conform to this standard. (However, at least one compiler reverses 
the order of the bits.) Porting considerations will be mainly in the area of I/O 
library routines. The Hochstrasser library is virtually identical to the Volition 
Systems library, so little more than recompiling should be necessary for this 
popular compiler. 

Wirth states, "With the module we have added another level of granularity 
in program structuring. The difficulties of finding a good partitioning—I 
carefully avoid the word ‘optimal'—are culminated at this level. . . . Lucky 
are those who hit a good solution at the outset, for any change affects all 
participant [modules]." (From "History and Goals of Modula-2," Byte, 
August 1984.) 

Amen to that! My initial partitioning had a module which I called Parser 
doing the decomposition of source lines into parts of speech, as well as 
syntax analysis and code generation. After that module had grown to more 
than a dozen large procedures and more than a thousand lines of code, I 
conceded that this was not the "optimal" partitioning. In the end, I split the 
original module into three smaller modules: Parser, SyntaxAnalyzer and 
CodeGenerator. This splitting made for some rather awkward variable and type 
importations. With that disclaimer in mind, we can go on to look at the data 
flow diagrams for the finished program. 


Assembly—Pass 1 

The purpose of the first pass through the 68000 source code is mainly to 
build a symbol table. As each instruction is scanned, an address counter is 
advanced based on the length of the instruction (68000 instructions vary in 
length from 2 to 10 bytes). When an EQU pseudo-op is encountered, its value 
must be entered in the symbol table, and when any other label is encountered, 
the value of the current address counter must be entered into the table. Because 
the same syntax analysis routines are used for both passes, some errors are 
reported during this pass. 

Figure 10.1 diagrams the data flow for pass 1. The 68000 source code is 
read one line at a time and split into four parts by the routines in Parser. 
Listings 10.3 and 10.14 give the definition and implementation modules for 
Parser. LineParts is the only procedure exported by Parser, but several 
routines hidden in the implementation module do most of the work. 


[Data flow diagram. Boxes: Symbol Table, ErrorX68, Operation Codes, 
Code Generator, Syntax Analyzer. Flows: source lines; labels; mnemonic; 
operands; address count or value; mnemonic opcodes; bit codes and mode 
information; error messages to console.] 


FIGURE 10.1 Data flow for pass 1. 

Parser passes any labels on to the SymbolTable module for entry in the 
symbol table; the opcodes (e.g., MOVE) go to the OperationCodes module 
where the machine code is extracted from a lookup table; and the operands 
(e.g., RO,(A2)) are sent, via the BuildSymTable procedure in CodeGenerator, 
to the SyntaxAnalyzer module where their format is checked and their size 
determined. The definition modules for SymbolTable, CodeGenerator and 
SyntaxAnalyzer are Listings 10.16, 10.19 and 10.18; their implementation 
modules are Listings 10.5, 10.9 and 10.8. 

If a label is present, BuildSymTable passes a value (most often the address 
count) to the SymbolTable module, where it is stored and referenced to its 
label. Although the SymbolTable module has four procedures for managing 
the symbol table, it is mostly the FillSymTab routine that finds work during 
pass 1. The ErrorX68 module (definition module in Listing 10.15, 
implementation module in Listing 10.4) sends error messages to the console 
and then returns to the main flow when the programmer acknowledges the 
error by pressing any key on the keyboard. 


Assembly—Pass 2 

The major purpose of pass 2 is, of course, to generate machine code, 
which is written to an S-record file on disk. In addition, a formatted program 
listing is created, also on disk. 

Before pass 2 actually starts, the SortSymTab procedure in the Symbol- 
Table module sorts all identifiers into alphabetical order. This organization 
allows their values to be found more quickly during the code generation pass. 
Most of the steps taken in pass 1 are repeated essentially unchanged during 
pass 2. Figure 10.2 shows the data flow diagram for this pass. Parser still 
performs the same task and passes labels, opcodes and operands on to the same 
modules as before. During pass 2, however, it is the GetObjectCode procedure 
in CodeGenerator that works with the various procedures in the Syntax- 
Analyzer module. 

The two "busiest" routines in the whole process are GetOperand from 
SyntaxAnalyzer and a routine called MergeModes hidden within the 
implementation module of CodeGenerator. GetOperand determines the mode 
and values (if any) for all operands; GetValue, GetSize and several other 
procedures help with the smaller jobs. MergeModes takes all the information 
from OperationCodes, SymbolTable and SyntaxAnalyzer and combines it to 
produce hexadecimal machine code. 

The Listing and Srecord modules use the machine code from Code- 
Generator to create their files. Listing also gets the complete lines of source 
code from the Parser module to merge it with the object code for that line. 
The result is a formatted listing, with addresses, object code, source code and 
page numbers. As an aid to debugging, the ListSymTab procedure in 
SymbolTable provides a sorted list of all identifiers, along with their values. 
The definition modules for Listing and Srecord are shown as Listings 10.20 
and 10.21; their implementation modules are Listings 10.10 and 10.11. 

[Data flow diagram. Boxes: Symbol Table, ErrorX68, Operation Codes, 
Code Generator, Syntax Analyzer, S-Record. Flows: source lines; labels; 
mnemonic; operands; bit codes and mode information; object code and 
addresses; error messages to console. Outputs: formatted listing file and 
S-record file.] 


FIGURE 10.2 Data flow for pass 2. 


The Main Program 


The main program for X68000 is shown as Listing 10.1. The above 
description should make it clear why there isn't too much for the main 
module to do. Its tasks consist of inputting and formatting the 68000 source 
code file name; opening, resetting and closing files; and providing the two 
REPEAT loops that control passes 1 and 2. 

The most interesting aspect of these jobs is the command line interface. 
Because I hacked at C before I learned Modula-2, I sometimes miss some of 
the facilities provided by that language. As mentioned, the module CmdLin2 
provides facilities similar to C's via the ArgC and ArgV arguments. The 
declaration of ArgV lists it as an ADDRESS; in Modula-2, that makes it 
compatible with all pointer types. What CmdLin2 does internally is create an 
array of pointers, with one pointing to each argument. ArgV points to that 
array. So, to use the ReadCmdLin procedure, I declare ArgV as: 


     POINTER TO ARRAY [1..n] OF POINTER TO STRING; 

and each string becomes: 

     ArgV^[i]^ 


Although this program has only one command argument (the file name), 
CmdLin2 was written as a general-purpose library routine. Incidentally, the 2 
in the name is because the compiler comes with a library module called 
CmdLin, which uses a more conventional (for Modula-2) approach to the 
problem—you bring in the whole command line as a string and parse it into 
arguments yourself. The C approach results in a smaller module but does 
more work! 


Portability 

This project was developed on a Z80 system using the Modula-2 System 
for Z80 CP/M from Hochstrasser Computing AG, but the source code should 
compile and run on most other small-computer implementations of Modula- 
2. I will try to identify those areas that may be impediments to portability. 
The only true machine dependency is because of the assumption that 
INTEGERs, CARDINALs and BITSETs all occupy 16 bits, with the least 
significant bit as bit zero. Although not all Modula-2 libraries are equivalent, 
I have tried to restrict myself to library routines that are common to both the 
Hochstrasser and the popular Volition Systems compilers. That should allow 
a fair degree of portability across many different compiler systems. 


LongNumbers 

The data type LONG (an array of integers) simulates 32-bit hexadecimal 
numbers; the implementation module for LongNumbers (Listing 10.2) 
provides procedures that input, manipulate and output the LONG data type. 

LongClear simply clears all elements of the array to zero. LongAdd 
(LongSub) is a multiple-precision routine that uses an integer for a carry 
(borrow) flag. The idea is to index through the array, calculating the sum 
(difference) while checking for any overflow (underflow); such an overflow 
(underflow) causes the carry (borrow) flag to be set and the result to be 
adjusted. This carry (borrow) is then figured into the next digit's calculation. 
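
The digit-by-digit addition just described can be sketched as follows. This is only an illustration, not the code of Listing 10.2: it assumes that LONG holds one hexadecimal digit (0 to 15) per array element with the least significant digit in element 1, and the helper WriteLong and the module name LongAddDemo are made up for the example. 

```modula2
MODULE LongAddDemo;

(* Sketch: LONG holds one hexadecimal digit (0..15) per element, least
   significant digit in element 1.  This layout is an assumption. *)

FROM InOut IMPORT Write, WriteString, WriteLn;

CONST
  Digits = 8;                          (* 8 hex digits = 32 bits *)

TYPE
  LONG = ARRAY [1..Digits] OF INTEGER;

VAR
  a, b, sum : LONG;
  i : CARDINAL;

PROCEDURE LongClear (VAR x : LONG);
VAR i : CARDINAL;
BEGIN
  FOR i := 1 TO Digits DO x[i] := 0 END;
END LongClear;

PROCEDURE LongAdd (a, b : LONG; VAR result : LONG);
VAR
  i : CARDINAL;
  carry, digit : INTEGER;
BEGIN
  carry := 0;
  FOR i := 1 TO Digits DO
    digit := a[i] + b[i] + carry;
    IF digit > 15 THEN                 (* this digit overflowed *)
      digit := digit - 16;             (* adjust the result *)
      carry := 1;                      (* figure it into the next digit *)
    ELSE
      carry := 0;
    END;
    result[i] := digit;
  END;
  (* A carry out of the top digit is dropped: the sum wraps modulo 2^32. *)
END LongAdd;

PROCEDURE WriteLong (x : LONG);        (* hypothetical output helper *)
VAR i : CARDINAL;
BEGIN
  FOR i := Digits TO 1 BY -1 DO        (* most significant digit first *)
    IF x[i] < 10 THEN
      Write (CHR (ORD ("0") + ORD (x[i])));
    ELSE
      Write (CHR (ORD ("A") + ORD (x[i]) - 10));
    END;
  END;
END WriteLong;

BEGIN
  LongClear (a);
  LongClear (b);
  FOR i := 1 TO 4 DO a[i] := 15 END;   (* a := $0000FFFF *)
  b[1] := 1;                           (* b := $00000001 *)
  LongAdd (a, b, sum);
  WriteString ("$");
  WriteLong (sum);
  WriteLn;
END LongAddDemo.
```

With these operands the carry ripples through the four low digits and the sum is $00010000. LongSub would follow the same pattern with a borrow: subtract, and if the digit goes negative, add 16 and borrow 1 from the next digit. 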

The conversion routines CardToLong and LongToCard use the standard 
hexadecimal/decimal conversion algorithms, with the addition of range 
checking. LongToCard checks the range of the LONG and returns FALSE if 
any of the four most significant digits are anything but 0. LongToInt is a bit 
more complicated because there are two possible in-range conditions: either 
all unused bits must be 0s (positive integer) or all must be 1s (negative 
integer). 

LongInc and LongDec use LongAdd and LongSub respectively, as well as 
CardToLong, to increment or decrement a LONG data type by any value in 
the CARDINAL range. LongCompare uses the standard comparison 
algorithm often used to compare strings. 

LongPut and LongWrite cause output of an array of integers as 
hexadecimal numbers. They both use an internal filter routine called GetDigit 
to trap integers outside the hexadecimal range. The Size parameter is used to 
allow LongPut and LongWrite to output only a portion of the number in 
cases where a small hexadecimal number is stored as a LONG or to output 
extra long strings of hex digits for the S-records. 

StringToLong converts an array of characters into a variable of type 
LONG. Error checking is done by the ISHEX routine; GetHEX handles the 
digit-by-digit conversion. The two address-bounds routines use the set 
operator IN to force a LONG to specific address boundaries. 


CmdLin2 


I wrote the command line parser (Listing 10.12) as an experiment: I 
wanted to see just how flexible the Modula-2 pointer structure really was. My 
conclusion is that it is just as flexible and powerful as the pointer structure in 
C and much easier to understand. 

This module could not have been written in Pascal because Pascal pointers 
can reference only variables dynamically created by the standard procedure 
NEW. Modula-2 pointers can be made to point to any data type. 

This routine parses the command line buffer of the operating system, 
which is referenced by an absolute variable at 80H, into an array of pointers 
to strings. (Absolute variables are another new feature of Modula-2 and allow 
any variable to be placed at a specific location in memory.) The parsing is 
done without even recopying the buffer by setting a pointer to the beginning 
of each argument and a null terminator at the end (replacing the space that 
normally separates command line arguments). After all arguments have been 
so processed, ArgV is set to point to the pointers. 
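
A rough sketch of this in-place technique follows. It is not the code of Listing 10.12: an ordinary array stands in for the absolute variable at 80H, the buffer is assumed to be null-terminated, and the names ArgSplitDemo, buffer, argV, argC and MaxArgs are invented for the illustration. 

```modula2
MODULE ArgSplitDemo;

(* Sketch: split a command line into arguments without copying it. *)

FROM SYSTEM IMPORT ADR;
FROM InOut IMPORT WriteString, WriteCard, WriteLn;

CONST
  MaxArgs = 8;
  Nul     = 0C;

TYPE
  STRING = ARRAY [0..127] OF CHAR;
  StrPtr = POINTER TO STRING;

VAR
  (* Stand-in for the command buffer; twice the size of STRING so that a
     pointer into the first half never reaches past the end. *)
  buffer  : ARRAY [0..255] OF CHAR;
  argV    : ARRAY [1..MaxArgs] OF StrPtr;
  argC, i, n : CARDINAL;

BEGIN
  buffer := "PROG.ASM  LIST  XREF";
  argC := 0;
  i := 0;
  LOOP
    WHILE buffer[i] = " " DO INC (i) END;      (* skip leading blanks *)
    IF (buffer[i] = Nul) OR (argC = MaxArgs) THEN EXIT END;
    INC (argC);
    argV[argC] := ADR (buffer[i]);             (* point at the argument *)
    WHILE (buffer[i] # " ") AND (buffer[i] # Nul) DO INC (i) END;
    IF buffer[i] = Nul THEN EXIT END;          (* last one: already ended *)
    buffer[i] := Nul;                          (* replace the blank *)
    INC (i);
  END;

  WriteCard (argC, 1);
  WriteString (" argument(s):");
  WriteLn;
  FOR n := 1 TO argC DO
    WriteString (argV[n]^);                    (* each string is argV[n]^ *)
    WriteLn;
  END;
END ArgSplitDemo.
```

The assignment of ADR (buffer[i]) to a typed pointer works because ADDRESS is compatible with all pointer types, which is the property the module relies on. 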

The CmdLin2 implementation uses a looping construct that is new to 
Modula-2: LOOP . . . END. This construct has two useful variations: the 
first is an infinite loop; the other, by using the optional EXIT statement, 
allows termination anywhere in the loop—even allowing multiple 
terminations. I know of one university instructor who asks his students to 
prove that WHILE and REPEAT are inappropriate before they can use LOOP. 
Some authors refer to it as an unstructured loop. I tend to agree, instead, with 
Donald Knuth, who feels that all constructs—even the lowly GOTO—have 
an inherent structure; if that structure matches the structure of the problem, 
that's the one to use. 

In CmdLin2, there are three EXIT statements in the LOOP . . . END 
statement. I went through many trials using WHILE and REPEAT, extra 
Boolean variables and all the usual structured "tricks," but none were as clear 
and simple as the LOOP (yet still reliable under all conditions of input). 

While reading through a Modula-2 textbook recently, I came across several 
examples of a program fragment that was supposed to read integers, add them 
to a sum and then stop when something other than an integer was read. All 
three of the examples either used two read statements, tested the same 
condition twice, or both. These examples, which were supposed to 
demonstrate the correct way to use WHILE and REPEAT loops, actually 
represented a case of trying to shoehorn the problem to fit the structure. Such 
examples appear frequently in programming texts. 

Modula-2's LOOP . . . EXIT . . . END construct provides a simple and 
elegant solution: 


     (* ReadInt and Done are imported from the standard module InOut. *) 
     (* Done is set TRUE if ReadInt is successful.                     *) 

     LOOP 
       ReadInt (num); 
       IF Done THEN 
         sum := sum + num; 
       ELSE 
         EXIT; 
       END; 
     END; 


Parser 

The name for this module is a bit of a misnomer because all it does is 
split up the 68000 source code into its components: LABEL, OP-CODE, 
SOURCE-OPERAND and DESTINATION-OPERAND (Listing 10.3). The 
algorithm is quite primitive in that it merely scans the line from left to right, 
looking for the various parts, and transfers them into variables called Label, 
OpCode, SrcOp and DestOp. These variables are arrays of characters defined in 
the definition part of this module. The location of each item is noted for later 
use in the error handling module so that the exact location of any error can be 
pointed out. 

The most convoluted part of the scanning process is picking out 
delimiters, especially when the normal delimiter characters get embedded 
within parentheses or quotes. The problem is handled by a couple of flags. 
ParCnt keeps track of (possibly nested) parentheses counts, and InQuotes 
becomes TRUE inside quotes; both are used to prevent incorrect detection of 
delimiters. Parser will flag an error if any identifier or expression is too long. 
Labels and opcodes are limited to 8 characters, and operands (including 
expressions) are limited to 20 characters. 
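
The delimiter test can be sketched along these lines. This is an illustration of the idea only, not the code of Listing 10.3; the procedure name FindComma and the test strings are assumptions made for the example. 

```modula2
MODULE DelimiterDemo;

(* Sketch: find the comma that separates source and destination operands,
   ignoring commas inside parentheses or quotes. *)

FROM InOut IMPORT WriteString, WriteCard, WriteLn;

CONST
  Nul   = 0C;
  Quote = "'";

VAR
  pos : CARDINAL;

PROCEDURE FindComma (line : ARRAY OF CHAR; VAR pos : CARDINAL) : BOOLEAN;
VAR
  i, parCnt : CARDINAL;
  inQuotes  : BOOLEAN;
  ch        : CHAR;
BEGIN
  parCnt := 0;
  inQuotes := FALSE;
  i := 0;
  WHILE (i <= HIGH (line)) AND (line[i] # Nul) DO
    ch := line[i];
    IF ch = Quote THEN
      inQuotes := NOT inQuotes;            (* entering or leaving quotes *)
    ELSIF NOT inQuotes THEN
      IF ch = "(" THEN
        INC (parCnt);
      ELSIF (ch = ")") AND (parCnt > 0) THEN
        DEC (parCnt);
      ELSIF (ch = ",") AND (parCnt = 0) THEN
        pos := i;                          (* a genuine delimiter *)
        RETURN TRUE;
      END;
    END;
    INC (i);
  END;
  RETURN FALSE;
END FindComma;

BEGIN
  IF FindComma ("8(A0,D1.W),D2", pos) THEN
    WriteString ("8(A0,D1.W),D2 splits at index ");
    WriteCard (pos, 1);
    WriteLn;
  END;
  IF FindComma ("#',',D0", pos) THEN
    WriteString ("#',',D0 splits at index ");
    WriteCard (pos, 1);
    WriteLn;
  END;
END DelimiterDemo.
```

In the first string the comma inside the parentheses is skipped and the split is found at index 10; in the second the quoted comma is skipped and the split is at index 4. 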


SymbolTable 

The implementation of this module (Listing 10.5) hides a data type called 
SYMBOL and variables called SymTab, Top and Next. This is an excellent 
example of the way Modula-2 allows you to separate lifetime and scope. 
These variables must exist during the entire time that the program is running, 
as they pass on information gathered in assembly pass 1 to assembly pass 2. 
Yet, at the same time, you don't want access to these variables except through 
the symbol table routines. To allow Pascal variables to exist for the life of 
the program, they would have to be declared as global, which would risk side 
effects from unplanned access. Modules provide a visibility wall between their 
contents and the rest of the program. SymTab, Next and Top cannot be 
accessed from outside SymbolTable (limited scope), but they exist for the life 
of the program (global lifetime). 

The FillSymTab routine simply adds a SYMBOL (a record consisting of 
an identifier and a LONG number) to the next open position in SymTab and 
returns an error if no room exists. SortSymTab uses a Shell sort to place the 
identifiers in alphabetical order for easy access. Notice in the Swap routine 
that entire records can be assigned in one statement in Modula-2. Not only is 
this more convenient than assigning one element at a time, but it is also 
more efficient: The compiler can use fast and compact assembly language 
routines to copy the data into the new variable. 

ReadSymTab uses a binary search to find quickly the value associated with 
any identifier. If the symbol is not found, ReadSymTab returns FALSE to the 
calling program. It is here, also, that duplicate symbol table entries are 
flagged (to do it in FillSymTab would require sorting and searching the table 
after each entry—hardly worth the extra time!). 
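
The sort, the search and the duplicate check can be sketched together as shown here. This is not the code of Listing 10.5: the value field is a CARDINAL rather than a LONG to keep the example short, and Compare, Enter, the table size and the extra duplicate parameter of ReadSymTab are assumptions made for the illustration. 

```modula2
MODULE SymTabDemo;

(* Sketch: Shell sort, binary search and duplicate detection. *)

FROM InOut IMPORT WriteString, WriteCard, WriteLn;

CONST
  MaxSym = 8;
  LastCh = 7;                          (* identifiers hold 8 characters *)

TYPE
  Name   = ARRAY [0..LastCh] OF CHAR;
  SYMBOL = RECORD
             id    : Name;
             value : CARDINAL;         (* the real table stores a LONG *)
           END;

VAR
  symTab : ARRAY [1..MaxSym] OF SYMBOL;
  top    : CARDINAL;                   (* number of entries in use *)
  key    : Name;
  val    : CARDINAL;
  dup    : BOOLEAN;

PROCEDURE Compare (VAR a, b : Name) : INTEGER;
VAR i : CARDINAL;
BEGIN
  i := 0;
  LOOP
    IF a[i] < b[i] THEN RETURN -1 END;
    IF a[i] > b[i] THEN RETURN 1 END;
    IF (a[i] = 0C) OR (i = LastCh) THEN RETURN 0 END;
    INC (i);
  END;
END Compare;

PROCEDURE Enter (id : Name; value : CARDINAL);
BEGIN
  INC (top);
  symTab[top].id := id;
  symTab[top].value := value;
END Enter;

PROCEDURE Swap (VAR a, b : SYMBOL);
VAR temp : SYMBOL;
BEGIN
  temp := a;  a := b;  b := temp;      (* whole records in one statement *)
END Swap;

PROCEDURE SortSymTab;                  (* Shell sort *)
VAR gap, i, j : CARDINAL;
BEGIN
  gap := top DIV 2;
  WHILE gap > 0 DO
    FOR i := gap + 1 TO top DO
      j := i;
      WHILE (j > gap) AND (Compare (symTab[j - gap].id, symTab[j].id) > 0) DO
        Swap (symTab[j - gap], symTab[j]);
        j := j - gap;
      END;
    END;
    gap := gap DIV 2;
  END;
END SortSymTab;

PROCEDURE ReadSymTab (key : Name; VAR value : CARDINAL;
                      VAR duplicate : BOOLEAN) : BOOLEAN;
VAR lo, hi, mid : CARDINAL;  c : INTEGER;
BEGIN
  duplicate := FALSE;
  lo := 1;  hi := top;
  WHILE lo <= hi DO
    mid := (lo + hi) DIV 2;
    c := Compare (key, symTab[mid].id);
    IF c = 0 THEN
      value := symTab[mid].value;
      (* after sorting, a twice-defined label sits next to its twin *)
      duplicate :=
        ((mid > 1) AND (Compare (key, symTab[mid - 1].id) = 0)) OR
        ((mid < top) AND (Compare (key, symTab[mid + 1].id) = 0));
      RETURN TRUE;
    ELSIF c < 0 THEN
      hi := mid - 1;
    ELSE
      lo := mid + 1;
    END;
  END;
  RETURN FALSE;
END ReadSymTab;

BEGIN
  top := 0;
  key := "START";   Enter (key, 4096);
  key := "COUNT";   Enter (key, 16);
  key := "BUFFER";  Enter (key, 8192);
  key := "COUNT";   Enter (key, 32);   (* deliberately defined twice *)
  SortSymTab;
  IF ReadSymTab (key, val, dup) THEN
    WriteString ("COUNT found");
    IF dup THEN WriteString (" (label defined twice)") END;
    WriteLn;
  END;
END SymTabDemo.
```

Because the table is sorted before it is searched, equal identifiers end up in adjacent slots, so checking the two neighbors of a match is enough to flag a label that was defined twice. A Shell sort is not stable, so which of the two values comes back for the duplicated label is not defined. 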

ListSymTab really returns only one entry in the table and is used by the 
Listing module to provide a symbolic reference table of identifiers at the end 
of the program. 


OperationCodes 

X68000 is a "sort of" table-driven assembler. The mnemonics and data 
used to derive the opcode bit patterns and all addressing mode information 
come from a file called OPCODE.DAT. This file must exist somewhere on 
the system any time X68000 is run because the data must be read into the 
lookup table used in the binary search routine to find the instructions. It is 
not a true table-driven assembler because it lacks the flexibility to accept 
tables of various processors and the table (OPCODE.DAT) is not a text file. 

In Modula-2, implementation modules may optionally specify an initial- 
ization part. This initialization is run on program start-up, before the main 
program runs. The purpose is to set initial conditions within the module. The 
initialization for OperationCodes (Listing 10.6) opens OPCODE.DAT (trying 
first the default drive, then drive A and, finally, drive B), and reads the data 
into an array called Table68K. This data file is stored in compact binary 
format. Instructions is the only routine in OperationCodes; it uses a binary 
search routine to find the correct mnemonic opcode in Table68K. If found, its 
bit pattern, as well as two SETs consisting of (enumerated) addressing modes, 
are returned to the calling program. If the opcode mnemonic is illegal (that is, 
not found), an error is flagged by the error-handling routine in ErrorX68. 




InitOperationCodes 

This program module (Listing 10.7) is not actually part of X68000 but 
merely creates the data file OPCODE.DAT described above. It contains most 
of the same declarations and definitions that OperationCodes contains, because 
their data types and variables have to match exactly. 

The lookup table for the mnemonics is created one mnemonic, bit pattern 
and addressing mode at a time. There are 118 mnemonics (ABCD to UNLK), 
and each is assigned one element of an array. Each element of the array is a 
four-field record. After data is assigned to the array properly, it is written to a 
disk file using the WriteRec procedure. 

Note: The WriteRec procedure may not be available on all imple- 
mentations of Modula-2 but may be added easily by the programmer. It makes 
use of the generic parameter type WORD. Two possible implementations are: 

• For machines, such as the PDP-11 and many microcomputers, in which 
  TSIZE (WORD) = 2: 


     PROCEDURE WriteRec (f : FILE; Rec : ARRAY OF WORD); 
     VAR 
       i   : CARDINAL; 
       ptr : POINTER TO CHAR; 
     BEGIN 
       (* Caution: this writes HIGH (Rec) + 1 bytes.  If HIGH counts *) 
       (* two-byte WORDs on your compiler, loop to 2 * HIGH (Rec) + 1. *) 
       ptr := ADR (Rec); 
       FOR i := 0 TO HIGH (Rec) DO 
         Write (f, ptr^); 
         INC (ptr);   (* move pointer to next byte *) 
       END; 
     END WriteRec; 


• For machines in which TSIZE (WORD) = 1 and a WriteWord procedure 
  exists: 


     PROCEDURE WriteRec (f : FILE; Rec : ARRAY OF WORD); 
     VAR 
       i : CARDINAL; 
     BEGIN 
       FOR i := 0 TO HIGH (Rec) DO 
         WriteWord (f, Rec[i]); 
       END; 
     END WriteRec; 


Similar routines can be developed for reading records, as required by the 
OperationCodes initialization. 

Some libraries provide procedures to read or write multiple bytes to a file. 
These usually require the location (an address) and size (in bytes) of the record. 
In this case, Modula-2's low-level facilities may be used to read or write the 
record: 


WriteNBytes (f, ADR (rec), SIZE (rec)); 


CodeGenerator 

This module (Listing 10.9) does more than its name suggests. The 
definition module of CodeGenerator lists three procedures: BuildSymTable, 
which "generates the code" for pass 1 (that is, feeds the symbol table); 
AdvAddrCnt, which increments the address counter after each instruction is 
analyzed (used in both passes); and GetObjectCode, which (with the help of 
many other procedures in CodeGenerator and SyntaxAnalyzer) figures out the 
machine code and returns it to the main program as three LONGs. I will try 
to present the procedures in the order that the code would pass through them 
during assembly. You might also want to refer to SyntaxAnalyzer, as these 
modules are tightly coupled and are used in both passes. 

BuildSymTable is used only during pass 1, and, as was mentioned above, 
it is involved with creating the symbol table. It does nothing (returns 
immediately) if there is no opcode. The cascaded IF ELSIF ELSE statement 
determines, first, if any assembler directive is present, and if so, specifies the 
amount of memory reserved (that will vary—EQU reserves no memory, 
whereas DS may reserve any amount). Next, if there is any label present 
(whether there was an assembler directive or not), an entry is made in the 
symbol table. That entry will consist of the current value of the address 
counter except in the case of the EQU assembler directive. 

Notice that an error due to a full symbol table will be detected here; such 
an error is fatal and will cause the assembler to abort. You must then split up 
your program into two or more modules. If there was no assembler directive 
(pseudo-op), GetOperand and GetInstModeSize (procedures from the 
SyntaxAnalyzer module) help to determine the size of the operands. The 
special QUICK mode instructions must be taken into account before 
determining how far the address is going to have to be advanced by this 
instruction. 

During pass 2, GetObjectCode first checks if there is any opcode, and it 
returns without doing anything if there is none. After making note of the size 
extension of the opcode, control is passed to ObjDir, which handles code 
generation for assembler directives (see detailed description later). Phase errors 
are checked by comparing the pass 1 address count (from the symbol table) 
with the pass 2 address count for any line that has a label. The two 
instruction types that use relative addressing modes are handled as a special 
case because of the odd requirements placed on them. (There is no automatic 
selection of the most efficient branch length. The assembler assumes the 
worst case and produces long branches unless explicitly told to use the short 
form. Full range checking is done for either long or short branches, however.) 
Object code for the balance of the instructions is produced in MergeModes 
(described next). 

Although the 68000 instruction set is very orthogonal (regular), there are a 
few instructions that have to be handled as special cases. That is the purpose 
of the CONST definitions at the top of the module: they are BITSET 
constants representing several operation codes. These are used within the 
MergeModes procedure to take care of the special cases before the more 
common addressing modes are handled. 

MergeModes is not exported from the definition module of CodeGenerator 
but is used by the GetObjectCode procedure to combine information from 
several sources to produce the hexadecimal machine code. MergeModes is at 
the heart of the code generation process. Many 68000 instructions use a 
format that is some variation of ADD Dn,<ea>. The effective address <ea> 
may be any of 12 addressing modes, although few instructions use all 12 
modes. There are four basic groupings of these modes—Data, Memory, 
Control and Alterable—which may be combined in various ways. The local 
procedures EffAdr and OperExt determine the bit patterns needed for the 
effective addressing mode being used along with any operand extension needed 
for that addressing mode. (For example, MOVE D3,600(A2) requires that bit 
patterns of 000011 and 101010 be inserted into the opcode for the source and 
destination operands, and an extension word of 0000001001011000 needs to 
be tagged on after the opcode for the 600 offset.) 
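
The bit patterns quoted in that example can be reproduced with a few lines of arithmetic. The sketch below is not the EffAdr procedure from Listing 10.9; the function EffAdr shown here, the constants DRegMode and ARDispMode, and the helper WriteBits are assumptions made for the illustration. 

```modula2
MODULE EffAdrDemo;

(* Sketch: build the 6-bit effective address fields and the extension
   word for MOVE D3,600(A2). *)

FROM InOut IMPORT Write, WriteString, WriteLn;

CONST
  DRegMode   = 0;      (* Dn      : mode field 000 *)
  ARDispMode = 5;      (* d16(An) : mode field 101 *)

PROCEDURE WriteBits (value, width : CARDINAL);
VAR i, weight : CARDINAL;
BEGIN
  weight := 1;
  FOR i := 2 TO width DO weight := weight * 2 END;    (* 2^(width-1) *)
  WHILE weight > 0 DO
    IF (value DIV weight) MOD 2 = 1 THEN Write ("1") ELSE Write ("0") END;
    weight := weight DIV 2;
  END;
END WriteBits;

PROCEDURE EffAdr (mode, reg : CARDINAL) : CARDINAL;
BEGIN
  RETURN mode * 8 + reg;   (* three mode bits, then three register bits *)
END EffAdr;

BEGIN
  WriteString ("source D3           : ");
  WriteBits (EffAdr (DRegMode, 3), 6);
  WriteLn;
  WriteString ("destination 600(A2) : ");
  WriteBits (EffAdr (ARDispMode, 2), 6);
  WriteLn;
  WriteString ("extension word      : ");
  WriteBits (600, 16);     (* the 16-bit displacement follows the opcode *)
  WriteLn;
END EffAdrDemo.
```

The three lines written are 000011, 101010 and 0000001001011000, matching the patterns given in the text. 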

The bulk of MergeModes cross-checks the addressing mode found by 
GetOperand with the modes that are allowed for the current instruction 
(information from OPCODE.DAT and the OperationCodes module). Any 
errors are passed onto the ErrorX68 module, where verbal error messages are 
displayed on the console. If no errors are detected, various bits and pieces of 
information are combined to produce the machine code for the instruction. 

An example will illustrate typical operation for the complete sequence 
needed for code generation. This sequence comprises two steps: converting the 
source code to an intermediate language consisting of sets, enumerations and 
various integral values; and then combining the elements of this intermediate 
language into Motorola 68000 machine language. 


ADD (A2),D6 ;Add the data word addressed by A2 
;to data register D6 


When the Parser module is finished with this instruction, three character 
strings are left: ADD, (A2) and D6. Parser passes ADD to OperationCodes, 
which returns three sets: {15, 14, 12} ---> 1101 0000 0000 0000, 
ModeA {OpM68D} and ModeB {EAOSy}. The first set is the raw bit pattern 
for this instruction—that is, the bit pattern without operand-size bits and 
operand addressing-mode bits added. The other two sets specify the addressing 
modes that the ADD instruction can use and are designations internal and 
unique to X68000. 

The Parser passes (A2) and D6 to the GetOperand procedure in 
SyntaxAnalyzer (through CodeGenerator), where they are analyzed. A record 
of type OpConfig, which contains the mode, value, size and other 
information, is provided for each operand. In this example, those records 
would contain the following data: 


     (A2)   Mode  ---> ARInd      ;register indirect 
            Value ---> none 
            Loc   ---> location on source line 
            Rn    ---> 2 
            Xn    ---> none 
            Xsize ---> none 
            Xtype ---> none 

     D6     Mode  ---> DReg       ;data register 
            Value ---> none 
            Loc   ---> location on source line 
            Rn    ---> 6 
            Xn    ---> none 
            Xsize ---> none 
            Xtype ---> none 


MergeModes will take all pertinent information from the above to produce 
the machine code or, if there are inconsistencies, produce error messages. 

An IF statement checks for each of the possible addressing modes. For 
example, the IF OpM68D IN AddrModeA statement would be used for the 
ADD (A2),D6 instruction because ADD resulted in ModeA {OpM68D} being 
returned from OperationCodes. The Dest.Mode is checked to see that it is 
DReg (it is), so the Dest.Rn (6) is shifted left by 9 and ORed with Op. 
Because shift left is not provided in Modula-2, I used multiplication to 
accomplish the same thing (multiplication by 2 is the same as shift left 1). 
Because of a principle called strength reduction, this multiplication (by a 
power of 2) is not nearly as inefficient as you might think. As part of this 
same IF statement, the size bits are ORed into place depending on the size 
suffix placed on the instruction. 

Because there is no size suffix on ADD (A2),D6, size WORD is assumed. 
The IF EAOSy IN AddrModeB will fill out the operation with more error 
checking and a call to the EffAdr local procedure mentioned above. This 
procedure checks that the mode used is consistent with the instruction, then 
uses bitwise AND/OR to append the correct bits. OperExt would have 
nothing to do on this instruction because neither ARInd nor DReg requires an 
operand extension. 
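
The arithmetic behind this example can be followed in a short sketch. It is not MergeModes itself: it works on CARDINAL values instead of BITSETs so that it does not depend on the compiler's bit order, and it uses addition where the text uses OR, which gives the same result here because the fields never overlap. The constant names are assumptions made for the illustration. 

```modula2
MODULE MergeDemo;

(* Sketch: assemble ADD (A2),D6 from its parts. *)

FROM InOut IMPORT WriteString, WriteHex, WriteLn;

CONST
  AddOp    = 53248;    (* bits {15, 14, 12} = 1101 0000 0000 0000 *)
  WordSize = 64;       (* size bits for WORD: 001 in bits 8..6 *)
  ARInd    = 2;        (* (An): mode field 010 *)

VAR
  op, destRn, srcRn : CARDINAL;

BEGIN
  destRn := 6;                        (* D6 *)
  srcRn  := 2;                        (* A2 *)
  op := AddOp;                        (* raw pattern from the lookup table *)
  op := op + destRn * 512;            (* "shift left by 9": 512 = 2^9 *)
  op := op + WordSize;                (* no suffix, so WORD is assumed *)
  op := op + ARInd * 8 + srcRn;       (* effective address field *)
  WriteString ("ADD (A2),D6 = $");
  WriteHex (op, 4);
  WriteLn;
END MergeDemo.
```

The four parts are $D000, $0C00, $0040 and $0012, so the finished opcode is $DC52. 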

All instructions follow a similar format: source line ---> line parts (label, 
opcode, operands) ---> intermediate language (bitsets, enumerations and so on) 
---> machine code. Any or all of the processes involved in reaching these 
states can result in error messages if the source line does not conform to 
correct 68000 syntax. 

The hidden procedure ObjDir is the assembler directive equivalent to 
MergeModes—it produces the code for the directives. It is essentially similar 
to the cascaded IF ELSIF ELSE statement that handles pseudo-ops in the 
BuildSymTable routine except that it has to generate the code, which means 
determining values and setting object code lengths. This procedure also 
handles ASCII strings. I'm not at all satisfied with this section of code, and if 
you think it looks like an afterthought, you're right! The awkwardness results 
from having to pass the string (which is converted to LONG) to the output 
modules as a 2-byte opcode and two 4-byte operands. The other alternative 
was a major rewrite, which will have to wait until Version 2 (when I will 
have to rewrite some of the code to accommodate a linker, anyway). 


SyntaxAnalyzer 

The procedures within SyntaxAnalyzer (Listing 10.18) are not visible to 
any other module except CodeGenerator (it was originally part of the same 
module). Its purpose is mainly to analyze the operands of the instructions and 
to determine their value and their mode. 

CalcValue and GetValue work together, as their names suggest, to 
determine the value of any operands that have a value. These include decimal 
numbers (0 to 65535); hexadecimal numbers (0 to $FFFFFFFF); binary 
numbers (0 to %1111111111111111); single quoted ASCII literals; the 
symbol for the current value of the program counter (*); and identifiers, which 
may represent any value. GetValue contains a simple left-to-right expression 
evaluation loop that recognizes only addition and subtraction operations. It 
hardly has the elegance of a recursive descent expression parser, but it is 
simple and compact. GetValue uses the LOOP . . . END construct with three 
conditional EXIT statements. Although this could have been done with a 
REPEAT loop, the termination condition would have been awkward, with six 
terms, three ANDs and three ORs. I rest my case! 
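
A left-to-right evaluation loop of this kind might look like the sketch below. It is not the GetValue of the listings: it accepts only decimal terms, returns an INTEGER rather than a LONG, and signals errors through its function result, all of which are simplifications made for the illustration. 

```modula2
MODULE ExprDemo;

(* Sketch: evaluate term {("+" | "-") term} strictly from left to right. *)

FROM InOut IMPORT WriteString, WriteInt, WriteLn;

VAR
  result : INTEGER;

PROCEDURE GetValue (expr : ARRAY OF CHAR; VAR value : INTEGER) : BOOLEAN;
VAR
  i        : CARDINAL;
  term     : INTEGER;
  op       : CHAR;
  gotDigit : BOOLEAN;
BEGIN
  value := 0;
  op := "+";
  i := 0;
  LOOP
    (* collect one decimal term *)
    term := 0;
    gotDigit := FALSE;
    WHILE (i <= HIGH (expr)) AND (expr[i] >= "0") AND (expr[i] <= "9") DO
      term := term * 10 + VAL (INTEGER, ORD (expr[i]) - ORD ("0"));
      gotDigit := TRUE;
      INC (i);
    END;
    IF NOT gotDigit THEN RETURN FALSE END;     (* operator without operand *)
    IF op = "+" THEN value := value + term ELSE value := value - term END;
    IF (i > HIGH (expr)) OR (expr[i] = 0C) THEN EXIT END;   (* end of text *)
    op := expr[i];
    IF (op # "+") AND (op # "-") THEN RETURN FALSE END;     (* not + or - *)
    INC (i);
  END;
  RETURN TRUE;
END GetValue;

BEGIN
  IF GetValue ("100+20-5", result) THEN
    WriteString ("100+20-5 = ");
    WriteInt (result, 1);
    WriteLn;
  END;
  IF NOT GetValue ("7+", result) THEN
    WriteString ("7+ is rejected");
    WriteLn;
  END;
END ExprDemo.
```

The first expression evaluates to 115. Because there is no operator precedence to worry about, a single accumulator and a remembered operator are all the state the loop needs. 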

Two procedures, called GetSize and GetAbsSize, determine the size of 
operands (BYTE, WORD, or LONG) by looking for a suffix of .B, .W or .L. 
If no suffix is present, size WORD is assumed, as required by the Motorola 
syntax. The GetAbsSize procedure actually creates a slightly nonstandard 
syntax for this assembler. Most 68000 assemblers will automatically choose 
the WORD absolute addressing mode for addresses in the ranges of -$8000 to 0 
and 0 to $7FFF (those that fit in a sign-extended 16-bit word) and LONG 
absolute addressing for higher addresses. X68000 will always use full 32-bit 
addressing unless specifically instructed to do otherwise. The nonstandard 
syntax is MOVE D0,$6000.W. 

GetInstModeSize uses a CASE statement to return the size of the 
object code, both in terms of address count and in terms of number of 
hexadecimal digits required. 

GetOperand is the workhorse of SyntaxAnalyzer, as it is responsible for 
performing lexical analysis on the rather complex and varied operands used in 
68000 assembly language. Not much elegance here—this routine simply 
looks for all the possible addressing modes. When it finds a good one, it 
returns all necessary information in the form of the record described above 
under CodeGenerator. If GetOperand finds an impossible addressing mode or 
an out of range register number, it uses ErrorX68 to report the error to the 
user console. 

GetMultReg is a routine that sorts out the MOVEM instructions. This 
instruction is like a multiple PUSH or multiple PULL operation. MOVEM 
D0-D7/A0-A6,-(SP), for example, will push all 68000 address and data 
registers onto the stack with one instruction. This is accomplished by a mask 
that follows the actual instruction, where each bit in the mask represents a 
register; if the bit is set, that register gets moved. The job of GetMultReg is 
to produce that mask from the register list. The — indicates a range: DO—D7 
means all registers between dO and d7 inclusive, and the / is just a separator. 
My strategy was to use a flag and an enumeration type along with nested IFs 
to keep track of what state the calculations were in. That made it easy to 
detect errors because it would cause a transition to an illegal state. Just to 
complicate matters, the mask has to be inverted in certain cases. That 
function is taken care of by a conditional subtraction as the mask is 
constructed. 
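
A minimal sketch of the mask construction follows, in Python rather than Modula-2. It does not reproduce GetMultReg's state machine; the names movem_mask and reg_number are hypothetical. It assumes the usual 68000 layout, with bit 0 for D0 up through bit 15 for A7, and the reverse order for the predecrement mode, which is where a subtraction from 15 comes in.

```python
# Sketch of building a MOVEM register mask (hypothetical helpers).


def reg_number(name):
    """Map D0..D7 to 0..7 and A0..A7 to 8..15."""
    if len(name) != 2 or name[0] not in "DA" or name[1] not in "01234567":
        raise ValueError("bad register name: %r" % name)
    return int(name[1]) + (8 if name[0] == "A" else 0)


def movem_mask(reg_list, predecrement=False):
    """Return the 16-bit mask for a list such as 'D0-D7/A0-A6'."""
    mask = 0
    for part in reg_list.upper().split("/"):       # '/' separates items
        first, _, last = part.partition("-")       # '-' marks a range
        lo = reg_number(first)
        hi = reg_number(last) if last else lo
        if hi < lo:
            raise ValueError("descending range: %r" % part)
        for n in range(lo, hi + 1):
            # For -(An) destinations the bit order is reversed.
            bit = 15 - n if predecrement else n
            mask |= 1 << bit
    return mask


if __name__ == "__main__":
    print("%04X" % movem_mask("D0-D7/A0-A6"))
    print("%04X" % movem_mask("D0-D7/A0-A6", predecrement=True))
```
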


Building the Listing 

The Listing module (Listing 10.10) creates the formatted program listing 
with object code and source code together in the usual format. The 
StartListing procedure does nothing but print a heading and initialize the page 
count and line count variables. WriteListLine does what its name 
implies—writes one line of listing to the file. Unused object code fields will 
be skipped automatically (no address is entered in the case of an EQU pseudo- 
op, for example). The LongPut routine from LongNumbers is used to write 
all object code (in hexadecimal) to the file. 

Modula-2 file libraries do not usually contain any way to write a string to 
a file (InOut and Texts do, but that's another story), so I had to write a small 
procedure to do that. The WriteStrF procedure outputs any string to a file and 
is used throughout Listing. 

The WriteSymTab procedure uses information imported from the 
SymbolTable module via ListSymTab to output a symbolic reference table at 
the end of the listing. ListSymTab will return the nth entry in the symbol 
table each time it is called. Both WriteListLine and WriteSymTab make use 
of a procedure called CheckPage to form-feed to the next page and print a new 
page number. The listing file created by this module may be dumped to a 
standard printer with the PIP command. 


Creating S-Record Files 

Creation of S-record files (Listing 10.11) is a bit more complicated than 
the creation of listing files. Count and checksum bytes must be calculated, 
and it is usual to split the object code into equal-length records unrelated to 
the length of the 68000 instructions (that is, some instructions may be split 
with half on one line of the file and the rest on the next line). 

My strategy in this was to accumulate object code until there was enough 
to output a 16-byte record, output that record (saving any extra bytes 
accumulated for the next record), then go back to accumulating more. This 
necessitated two storage arrays and two indexes for accessing them: 
Sdata/Sindex and Xdata/Xindex. Another complication is that records should 
start on boundaries divisible by 16 whenever possible. Only three procedures 
are exported from the definition module of Srecord: StartSrec, WriteSrecLine 
and EndSrec. Several other procedures that are hidden within the 
implementation module do much of the work. 

StartSrec creates the S0 (header) record. This record consists mainly of the 
source filename as ASCII characters represented in hexadecimal format. 
However, all S-records must have an address (this is always 0 for the header 
record), a byte count and a checksum. The byte count includes a count for the 
address and the checksum; the checksum is the complement of the 1-byte 
residual of the sum of the address, count and data. 
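
The count and checksum rules can be sketched in a few lines. This is Python rather than Modula-2 and is not the Srecord module itself; the function name srecord and its parameters are hypothetical. It assumes the standard Motorola layout of type, count, address, data and checksum, all written as hexadecimal text.

```python
# Sketch of formatting one S-record (hypothetical helper).


def srecord(rec_type, address, data, addr_bytes=2):
    """Format one S-record line.

    rec_type   -- digit after the 'S' (0 = header, 1/2/3 = data, 7/8/9 = trailer)
    addr_bytes -- width of the address field in bytes (2 for S0/S1/S9, 3 for S2/S8)
    """
    payload = address.to_bytes(addr_bytes, "big") + bytes(data)
    count = len(payload) + 1                    # address + data + checksum byte
    checksum = ~(count + sum(payload)) & 0xFF   # complement of the low byte of the sum
    return "S%d%02X%s%02X" % (rec_type, count, payload.hex().upper(), checksum)


if __name__ == "__main__":
    print(srecord(0, 0, b"TEST.ASM"))           # header carrying a file name
    print(srecord(8, 0, b"", addr_bytes=3))     # S8 trailer with a zero address
```
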

The WriteSrecLine procedure returns immediately if there is no address to 
write out (this occurs on blank source lines or for EQU statements only). 
Next, Xdata is transferred to Sdata if there was anything left over from the 
previous call to WriteSrecLine. If for any reason the address count passed into 
WriteSrecLine is different from the internal count, any existing data is output 
as a complete S-record, and a new record is started. This would occur any time 
a new ORG statement is encountered in the source code. Finally, ObjOp, 
ObjSrc and ObjDest (object code for opcode, source and destination, 
respectively) are each appended to Sdata. If Sdata now contains more than 16 
bytes, the record is written out to the file (this is detected by AppendData 
returning FALSE). Any excess object bytes are retained in Xdata. 
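
The effect of the 16-byte boundary rule can be shown with a simplified sketch, in Python rather than Modula-2. Unlike WriteSrecLine, which accumulates object code one source line at a time, this hypothetical split_records function assumes the whole block of object code is already in memory; it only shows where the record boundaries fall.

```python
# Sketch of splitting object code into records that end on 16-byte boundaries.


def split_records(start, code, size=16):
    """Yield (address, chunk) pairs covering `code`, which begins at `start`.

    Every chunk stops at the next multiple of `size`, so all records after
    the first begin on an aligned address.
    """
    addr = start
    pos = 0
    while pos < len(code):
        room = size - (addr % size)        # bytes left before the next boundary
        chunk = code[pos:pos + room]
        yield addr, chunk
        addr += len(chunk)
        pos += len(chunk)


if __name__ == "__main__":
    for addr, chunk in split_records(0x1004, bytes(range(40))):
        print("%06X %2d bytes" % (addr, len(chunk)))
```
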

The information in Sdata and Xdata is retained between calls to 
WriteSrecLine because these variables are declared within the module (not 
within a procedure). These variables cannot be seen outside the Srecord 
module (local visibility) but remain in existence throughout the lifetime of 
the program (global existence). In Pascal, lifetime and visibility are tied 
together: If a variable exists, it is visible; if it is not visible, it no longer 
exists. Local variables in Pascal therefore cannot retain values between 
calls, which is a very useful capability, as illustrated here. 
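
The combination of hidden but persistent state can be imitated in other languages. The Python sketch below is only an analogy, not a rendering of the Srecord module: the hypothetical make_accumulator function keeps a buffer that callers cannot see but that survives from one call to the next, much as Sdata and Xdata do.

```python
# Sketch of state that is invisible to callers yet persists between calls.


def make_accumulator(limit=16):
    """Return a function that collects bytes and emits full records."""
    pending = bytearray()              # hidden from callers, kept between calls

    def append(data):
        """Add data; return a list of any complete `limit`-byte records."""
        pending.extend(data)
        records = []
        while len(pending) >= limit:
            records.append(bytes(pending[:limit]))
            del pending[:limit]        # leftovers wait for the next call
        return records

    return append


if __name__ == "__main__":
    add = make_accumulator(limit=4)
    print(add(b"abc"))                 # not enough for a record yet
    print(add(b"defgh"))               # earlier bytes are still there
```
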

The EndSrec procedure outputs any data left over from the final call to 
WriteSrecLine and then outputs a fixed S8 trailer record. 

Actually, this even-boundary and consistent-length business is not required 
by the S-record format because each S-record is totally autonomous in that it 
begins with both its own starting address and a byte count. It is traditional for 
Motorola S-records to be formatted as described above, however. 


ErrorX68 

Most error handling (except that involving files) is done by the ErrorX68 
module (Listing 10.4). This module defines an enumeration type that provides 
12 named error types. The procedure Error outputs the line count (source line 
where the error was found), the source line itself, an arrow pointing to the 
error and an error message. The program is then suspended until the operator 
acknowledges the error by pressing any key on the console keyboard. If more 
than 500 errors occur, the program is terminated with an appropriate message. 

After pass 2 is completed, WriteErrorCount outputs an END OF 
ASSEMBLY message to both the console and the listing file. 


Compiling X68000 

Because many of the modules within X68000 interact in complicated 
ways, the order of compilation is critical. Specifically, if ModuleA imports 
objects defined in ModuleB, it is clear that ModuleB must be compiled first. 
In all cases, a definition module must be compiled before its implementation 
module can be compiled; however, it goes further than that. The definition 
module of Parser, for example, defines two types (TOKEN and OPERAND) 
that are used in SymbolTable, CodeGenerator, SyntaxAnalyzer and others. 
Therefore, Parser.DEF must be compiled before any module that depends on 
it. If the correct order is not followed, the compiler will produce "undefined 
identifier" errors. Many similar situations exist in any nontrivial Modula-2 
program. 

The compilation order shown in Table 10.2 avoids any problems but is 
not the only possible ordering arrangement. (A harmless circular reference 
exists between Parser and ErrorX68 because each imports objects from the 
other. This only affects the order of execution of their respective 
initialization parts and causes no problems of any sort.) 
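
Finding a workable order is a topological sort of the import graph. The sketch below uses Python's standard graphlib module rather than anything from the Modula-2 toolchain, and the dependency table in it is an illustrative guess at a small part of the X68000 graph, not a transcription of the real import lists. It also shows why the Parser/ErrorX68 circularity is harmless: each implementation module depends only on the other's definition module, so there is no cycle among the compilation units.

```python
# Sketch of deriving a compilation order from import dependencies.
from graphlib import TopologicalSorter


def compile_order(deps):
    """Return the units in an order where each follows everything it needs.

    deps maps a compilation unit to the set of units it must be compiled after.
    """
    return list(TopologicalSorter(deps).static_order())


# Assumed, partial dependency table (for illustration only).
DEPS = {
    "Parser.DEF": set(),
    "ErrorX68.DEF": set(),
    "SymbolTable.DEF": {"Parser.DEF"},
    "SymbolTable.MOD": {"SymbolTable.DEF", "Parser.DEF"},
    "ErrorX68.MOD": {"ErrorX68.DEF", "Parser.DEF"},
    "Parser.MOD": {"Parser.DEF", "ErrorX68.DEF"},
}

if __name__ == "__main__":
    for unit in compile_order(DEPS):
        print(unit)
```

Many different orders satisfy the same graph, which is why the order in Table 10.2 is only one of several that work.
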


Modula-2 Design Strategy 

The implementation module often hides details not apparent in the 
definition module. Obviously, the algorithm is encapsulated inside the 
implementation module, but it goes further than that. Constants, types, 
variables and procedures that are not visible from the definition module may 
contribute an important part to the module's function. (For example, the 
GetDigit, ISHEX and GetHEX routines from LongNumbers as well as the 
LineParts procedure from Parser are unknown to the definition modules, and 
hence they need not be known by any programmer using these modules.) 

Additionally, the module initialization may play an important role in the 
structure of the program, as in the SymbolTable module when SymTab is 
cleared and the indexes for access to the table are set to their starting points or 
as in OperationCodes where data is brought in from a file. Finally, modules 
may be used to control visibility and lifetime fully: Nothing is visible 
outside a module unless it is exported, but variables belonging to library 
modules exist throughout the life of the program. These features allow large 
programming projects to be constructed of modules of only a few closely 
related procedures. Debugging is easier and maintenance is finally possible! 


 1. CmdLin2.DEF, CmdLin2.MOD 
 2. LongNumbers.DEF, LongNumbers.MOD 
 3. Parser.DEF 
 4. CodeGenerator.DEF 
 5. SyntaxAnalyzer.DEF 
 6. SymbolTable.DEF, SymbolTable.MOD 
 7. OperationCodes.DEF, OperationCodes.MOD 
 8. Listing.DEF 
 9. Srecord.DEF, Srecord.MOD 
10. ErrorX68.DEF, ErrorX68.MOD 
11. Parser.MOD 
12. SyntaxAnalyzer.MOD 
13. CodeGenerator.MOD 
14. X68000.MOD 


TABLE 10.2 Compilation order for X68000. 


Conclusion 

Although X68000 is a fully functional program, I do not consider it a 
completed project—several areas could use improvement. Expression and 
string evaluation should be improved. The first steps would be to expand and 
improve LongNumbers to include multiplication and division and to improve 
efficiency. This can wait until Modula-2 compilers conform to the new 
standard and add long integer and cardinal data types. Constant strings should 
be expanded to at least 80 characters per line. That is harder than it sounds 
because of the way parameters are passed to the Listing and Srecord modules. 

Finally, there is the matter of a linker. To adapt the assembler to provide 
relocation information will require some rewriting of existing code and some 
new code; however, many of the modules could be reused almost intact. Then 
there is the Linker itself—not a trivial task either. If any readers have further 
suggestions on how X68000 can be improved, please pass them along to me. 
Better yet, make the changes yourself and hand the program back into the 
public domain in improved form. 


Listing 10.1 


MODULE X68000; 
(* 
   MC68000 Cross Assembler 
   Copyright (c) 1985 by Brian R. Anderson 

   This program may be copied for personal, non-commercial use 
   only, provided that the above copyright notice is included 
   on all copies of the source code. Copying for any other use 
   without the consent of the author is prohibited. 
*) 

FROM Terminal IMPORT 
WriteString, WriteLn, ReadString; 


FROM Files IMPORT 
FILE, FileState, Open, Create, Write, Close; 


FROM Strings IMPORT 
STRING, CompareStr, Assign, Concat, Length, Delete; 


IMPORT ASCII; 


FROM CmdLin2 IMPORT (* Access CP/M command line *) 
ReadCmdLin; 


FROM LongNumbers IMPORT 
LONG; 


FROM SymbolTable IMPORT 
SortSymTab; 


FROM Parser IMPORT 
TOKEN, OPERAND, LineCount, LineParts; 


FROM CodeGenerator IMPORT 
LZero, AddrCnt, Pass2, BuildSymTable, AdvAddrCnt, GetObjectCode; 


FROM Listing IMPORT 
StartListing, WriteListLine, WriteSymTab; 


FROM Srecord IMPORT 
StartSrec, WriteSrecLine, EndSrec; 


FROM ErrorX68 IMPORT 
ErrorCount, WriteErrorCount; 


TYPE 
   FileName = ARRAY [0..14] OF CHAR; 


VAR 
   ArgC : CARDINAL; 
   ArgV : POINTER TO ARRAY [1..3] OF POINTER TO STRING;   (* Command Line *) 
   SourceFN, ListFN, SrecFN : FileName; 
   Source, List, Srec : FILE; 
   Label, OpCode : TOKEN; 
   SrcOp, DestOp : OPERAND; 
   EndFile : BOOLEAN; 
   NumSyms : CARDINAL; 
   ObjOp, ObjSrc, ObjDest : LONG; 
   nA, nO, nS, nD : CARDINAL; 


PROCEDURE MakeNames (VAR S, L, R : FileName); 
(* builds names for Source, Listing & S-Record files *) 

VAR 
   T : FileName;   (* temporary work name *) 
   i, l : CARDINAL; 

BEGIN 
   L := ""; R := "";   (* set Listing & S-rec names to null *) 

   i := 0; l := 0; 
   WHILE (S[i] # 0C) AND (S[i] # ' ') DO 
      IF S[i] = '.' THEN   (* mark beginning of file extension *) 
         l := i; 
      END; 
      INC (i); 
   END; 

   IF S[i] = ' ' THEN 
      Delete (S, i, Length (S) - i); 
   END; 

   Assign (S, T); 
   IF l = 0 THEN 
      Concat (T, ".ASM", S); 
   ELSE 
      Delete (T, l, i - l); 
   END; 

   Concat (T, ".LST", L); 
   Concat (T, ".S", R); 
END MakeNames; 


PROCEDURE OpenFiles; 
BEGIN 
   IF Open (Source, SourceFN) # FileOK THEN 
      WriteLn; 
      WriteString ("No Source File: "); WriteString (SourceFN); 
      WriteLn; 
      HALT; 
   END; 

   IF Create (List, ListFN) # FileOK THEN   (* DOS may trap this *) 
      WriteLn; 
      WriteString ("Cannot create disk files!"); WriteLn; 
      HALT; 
   END; 

   IF Create (Srec, SrecFN) # FileOK THEN 
      WriteLn; 
      WriteString ("Cannot create disk files!"); WriteLn; 
      HALT; 
   END; 
END OpenFiles; 


PROCEDURE StartPass2; 
BEGIN 
IF (Close (Source) # FileOK) OR 
(Open (Source, SourceFN) # FileOK) THEN 
WriteString ("Unable to 'Reset' Source file for 2nd Pass."); 


WriteLn; 

HALT; 
END; 
Pass2 := TRUE; (* Pass2 IMPORTed from CodeGenerator *) 
AddrCnt := LZero; (* Assume ORG = 0 to start *) 
ErrorCount := 0; (* ErrorCount IMPORTed from ErrorX68 *) 
LineCount := 0; (* LineCount IMPORTed from Parser *) 
EndFile := FALSE; 


END StartPass2; 


PROCEDURE CloseFiles; 
BEGIN 
IF (Close (Source) # FileOK) 
OR (Close (List) # FileOK) 
OR (Close (Srec) # FileOK) THEN 
WriteString ("Error closing files..."); WriteLn; 
HALT; 
END; 
END CloseFiles; 


BEGIN (* X68000 -- main program *) 
ReadCmdLin (ArgC, ArgV); 

IF ArgC = 0 THEN 
WriteLn; 
WriteString ("Enter Source Filename: "); 
ReadString (SourceFN); 
WriteLn; 
ELSE 
Assign (ArgV^[1]^, SourceFN); 
END; 


MakeNames (SourceFN, ListFN, SrecFN); 


OpenFiles; 

WriteLn; 

WriteString (" 68000 Cross Assembler") ; WriteLn; 
WriteString (" Copyright (c) 1985 by Brian R. Anderson") ; 
WriteLn; WriteLn; 

WriteString (" Assembling "); WriteString (SourceFN) ; 


WriteLn; WriteLn; WriteLn; 


(*--- 
Begin Pass 1 
---*) 
WriteString ("PASS 1"); WriteLn; 
AddrCnt := LZero; (* Assume ORG = 0 to start *) 


EndFile := FALSE; 


REPEAT 
LineParts (Source, EndFile, Label, OpCode, SrcOp, DestOp); 
BuildSymTable (AddrCnt, Label, OpCode, SrcOp, DestOp); 
AdvAddrCnt (AddrCnt) ; 

UNTIL EndFile OR (CompareStr (OpCode, "END") = 0); 


(*--- 


Begin Pass 2 
---*) 
WriteString ("PASS 2"); WriteLn; 
StartPass2; (* get Source file, Parser & ErrorX68 ready for 2nd pass *) 
SortSymTab (NumSyms) ; 
StartListing (List); 
StartSrec (Srec, SourceFN) ; 


REPEAT 
LineParts (Source, EndFile, Label, OpCode, SrcOp, DestOp); 
GetObjectCode (Label, OpCode, 
SrcOp, DestOp, 
AddrCnt, ObjOp, ObjSrc, ObjDest, 
nA, nO, nS, nD); 
WriteListLine (List, AddrCnt, ObjOp, ObjSrc, ObjDest, nA, nO, nS, nD); 
WriteSrecLine (Srec, AddrCnt, ObjOp, ObjSrc, ObjDest, nA, nO, nS, nD); 
AdvAddrCnt (AddrCnt); 
UNTIL EndFile OR (CompareStr (OpCode, "END") = 0); 


EndSrec (Srec); (* Also: Finish off any partial line *) 
WriteErrorCount (List); (* Error count output to Console & Listing file *) 
WriteSymTab (List, NumSyms) ; (* Write Symbol Table to Listing File *) 
CloseFiles; 

END X68000. 




Listing 10.2 


IMPLEMENTATION MODULE LongNumbers; 
(* Routines to handle HEX digits for the X68000 cross assembler. *) 
(* All but LongPut and LongWrite are limited to 8 digit numbers. *) 


FROM Files IMPORT 
   FILE; 
IMPORT Files;   (* Write *) 


IMPORT Terminal;   (* Write *) 


(*--- 
(* These objects are declared in the DEFINITION MODULE *) 
CONST 
   DIGITS = 8; 
   BASE = 16; 
TYPE 
   LONG = ARRAY [1..DIGITS] OF INTEGER; 
---*) 


CONST 
   Zero = 30H; 
   Nine = 39H; 
   hexA = 41H; 
   hexF = 46H; 


PROCEDURE LongClear (VAR A : LONG); 
(* Sets A to Zero *) 
VAR 
   i : CARDINAL; 
BEGIN 
   FOR i := 1 TO DIGITS DO 
      A[i] := 0; 
   END; 
END LongClear; 


PROCEDURE LongAdd (A, B : LONG; VAR Result : LONG); 
(* Add two LONGs, giving Result *) 
VAR 
   Carry : INTEGER; 
   i : CARDINAL; 
BEGIN 
   Carry := 0; 
   FOR i := 1 TO DIGITS DO 
      Result[i] := (A[i] + Carry) + B[i]; 
      IF Result[i] >= BASE THEN 
         Result[i] := Result[i] - BASE; 
         Carry := 1; 
      ELSE 
         Carry := 0; 
      END; 
   END; 
END LongAdd; 


PROCEDURE LongSub (A, B : LONG; VAR Result : LONG); 
(* Subtract two LONGs (A - B), giving Result *) 


VAR 
Borrow : INTEGER; 
i : CARDINAL; 


BEGIN 
Borrow := 0; 
FOR i := 1 TO DIGITS DO 
Result[i] := (A[i] - Borrow) - B[i]; 
IF Result[i] < 0 THEN 
Result[i] := Result[i] + BASE; 
Borrow := 1; 
ELSE 
Borrow := 0; 
END; 
END; 


END LongSub; 


PROCEDURE CardToLong (n : CARDINAL; VAR A : LONG); 
(* Converts CARDINALs to LONGs *) 


VAR 
i: CARDINAL; 


BEGIN 
LongClear (A); 


i := 1; 

REPEAT 
A[i] := n MOD BASE; 
INC (i); 
n := n DIV BASE; 
UNTIL n = 0; 
END CardToLong; 


PROCEDURE LongToCard (A : LONG; VAR n : CARDINAL) : BOOLEAN; 
(* Converts LONG TO CARDINAL, returns FALSE if conversion impossible *) 


BEGIN 
n := (A[4] * 4096) + (A[3] * 256) + (A[2] * 16) + A[1]; 
RETURN ((A[5] = 0) AND (A[6] = 0) AND (A[7] = 0) AND (A[8] = 0)); 


END LongToCard; 


PROCEDURE LongToInt (A : LONG; VAR n : INTEGER) : BOOLEAN; 
(* Converts LONG to INTEGER, returns FALSE if conversion impossible *) 


VAR 
TempC : CARDINAL; 
Neg : BOOLEAN; 


BEGIN 

IF (A[5] = 0) AND (A[6] = 0) AND (A[7] = 0) AND (A[8] = 0) THEN 
Neg := FALSE; 

ELSIF (A[5] = 15) AND (A[6] = 15) AND (A[7] = 15) AND (A[8] = 15) THEN 
Neg := TRUE; 

ELSE 
RETURN FALSE; (* Out of INTEGER range *) 

END; 

TempC := (A[4] * 4096) + (A[3] * 256) + (A[2] * 16) + A[1]; 

IF ((TempC <= 32767) AND (NOT Neg)) OR ((TempC > 32767) AND Neg) THEN 
n := INTEGER (TempC) ; 
RETURN TRUE; 

ELSE 
RETURN FALSE; 

END; 


END LongToInt; 


PROCEDURE LongInc (VAR A : LONG; n : CARDINAL) ; 
(* Increment LONG by n *) 


VAR 
T : LONG; 


BEGIN 
CardToLong (n, T); 
LongAdd (A, T, A); 
END LongInc; 


PROCEDURE LongDec (VAR A : LONG; n : CARDINAL); 
(* Decrement LONG by n *) 
VAR 
   T : LONG; 
BEGIN 
   CardToLong (n, T); 
   LongSub (A, T, A); 
END LongDec; 


PROCEDURE LongCompare (A, B : LONG) : INTEGER; 
(* Returns: 0 if A = B, -1 if A < B, +1 if A > B *) 
VAR 
   i : CARDINAL; 
BEGIN 
   i := DIGITS; 
   WHILE (i > 0) AND (A[i] = B[i]) DO 
      DEC (i); 
   END; 

   IF i = 0 THEN 
      RETURN 0; 
   ELSIF A[i] < B[i] THEN 
      RETURN -1; 
   ELSIF A[i] > B[i] THEN 
      RETURN +1; 
   ELSE 
      (* Impossible! *) 
   END; 
END LongCompare; 


PROCEDURE GetDigit (n : INTEGER) : CHAR; 
(* Function returning HEX character corresponding to digit *) 


BEGIN 
IF (n >= 0) AND (n <= 9) THEN 
RETURN CHR (CARDINAL (n) + Zero); 
ELSIF (n >= 10) AND (n <= 15) THEN 


RETURN CHR ((CARDINAL (n) - 10) + hexA); 
ELSE 

RETURN '*'; 
END; 


END GetDigit; 


PROCEDURE LongPut (f : FILE; A : ARRAY OF INTEGER; Size : CARDINAL); 
(* Put LONG number in FILE f *) 


VAR 
i: CARDINAL; 


BEGIN 
IF Size = 0 THEN 
RETURN; 
END; 


DEC (Size); (* adjust for zero-based array *) 
IF Size > HIGH (A) THEN 

Size := HIGH (A); 
END; 


FOR i := Size TO 0 BY -1 DO 
Files.Write (f, GetDigit (A[i])); 
END; 
END LongPut; 


PROCEDURE LongWrite (A : ARRAY OF INTEGER; Size : CARDINAL); 
(* Write LONG number to console screen *) 


VAR 
i : CARDINAL; 


BEGIN 
IF Size = 0 THEN 
RETURN; 
END; 


DEC (Size); 

IF Size > HIGH (A) THEN 
Size := HIGH (A); 

END; 


FOR i := Size TO 0 BY -1 DO 
Terminal.Write (GetDigit (A[i])); 
END; 
END LongWrite; 


PROCEDURE ISHEX (c : CHAR) : BOOLEAN; 
(* checks if c is one of 0..9, A..F *) 


VAR 
C : CARDINAL; 


BEGIN 
C := ORD (CAP (c)); 
RETURN (((C >= Zero) AND (C <= Nine)) OR 
((C >= hexA) AND (C <= hexF))); 
END ISHEX; 
PROCEDURE GetHEX (c : CHAR) : INTEGER; 


(* returns HEX value of character *) 


VAR 
C : CARDINAL; 


BEGIN 
C := ORD (CAP (c)); 
IF C < hexA THEN 
RETURN INTEGER (C - Zero); 
ELSE 
RETURN 10 + INTEGER (C - hexA); 
END; 
END GetHEX; 


PROCEDURE StringToLong (S : ARRAY OF CHAR; VAR A : LONG) : BOOLEAN; 
(* Converts a string (in HEX) into a LONG *) 
VAR 
   i, j : CARDINAL; 
BEGIN 
   LongClear (A); 

   IF S[0] # '$' THEN 
      RETURN FALSE;   (* not a HEX string *) 
   ELSE 
      j := 1; 
      WHILE (ISHEX (S[j])) AND (j <= DIGITS) DO 
         INC (j); 
      END; 
      DEC (j);   (* gone too far, so back up one *) 
      i := 1; 
      WHILE j > 0 DO 
         A[i] := GetHEX (S[j]); 
         INC (i); DEC (j); 
      END; 
      IF A[i - 1] > 7 THEN   (* sign extend *) 
         FOR j := i TO DIGITS DO 
            A[j] := 15; 
         END; 
      END; 
      RETURN (i > 1); 
   END; 
END StringToLong; 


PROCEDURE BinStrToLong (S : ARRAY OF CHAR; VAR A : LONG) : BOOLEAN; 
(* Converts a string (in Binary, maximum of 16 bits) into a LONG *) 
CONST 
   MAXBit = 16; 
VAR 
   Bin, i : CARDINAL; 
   Neg : BOOLEAN; 
BEGIN 
   IF S[0] # '%' THEN 
      RETURN FALSE; 
   END; 

   IF S[1] = '1' THEN 
      Neg := TRUE; 
   ELSE 
      Neg := FALSE; 
   END; 

   Bin := 0; 
   i := 1; 
   WHILE S[i] # 0C DO 
      IF i > MAXBit THEN 
         RETURN FALSE; 
      END; 

      Bin := Bin * 2; 

      IF S[i] = '0' THEN 
         (* No Action Needed *) 
      ELSIF S[i] = '1' THEN 
         Bin := Bin + 1; 
      ELSE   (* Not a valid binary digit *) 
         RETURN FALSE; 
      END; 

      INC (i); 
   END; 

   CardToLong (Bin, A); 

   IF Neg THEN   (* sign extend *) 
      i := DIGITS; 
      WHILE A[i] = 0 DO 
         A[i] := 15; 
         DEC (i); 
      END; 
      IF A[i] < 8 THEN 
         IF A[i] < 4 THEN 
            IF A[i] < 2 THEN 
               A[i] := A[i] + 2; 
            END; 
            A[i] := A[i] + 4; 
         END; 
         A[i] := A[i] + 8; 
      END; 
   END; 

   RETURN TRUE; 
END BinStrToLong; 


PROCEDURE AddrBoundL (VAR A : LONG); 
(* Forces A to a long word boundary *) 
BEGIN 
   WHILE NOT (CARDINAL (A[1]) IN {0, 4, 8, 12}) DO 
      LongInc (A, 1); 
   END; 
END AddrBoundL; 


PROCEDURE AddrBoundW (VAR A : LONG); 
(* Forces A to a word boundary *) 
BEGIN 
   WHILE NOT (CARDINAL (A[1]) IN {0, 2, 4, 6, 8, 10, 12, 14}) DO 
      LongInc (A, 1); 
   END; 
END AddrBoundW; 


END LongNumbers. 


Listing 10.3 


IMPLEMENTATION MODULE Parser; 
(* Reads the Source file, and splits each *) 
(* line into Label, OpCode & Operand(s). *) 


FROM Strings IMPORT 
STRING; 


FROM Files IMPORT 
FILE, EOF, Read; 


FROM ErrorX68 IMPORT 
ErrorType, Error; 


IMPORT ASCII; 


(*--- 
(* These objects are declared in the DEFINITION MODULE *) 
CONST 
TokenSize = 8; 
OperandSize = 20; 


TYPE 
TOKEN = ARRAY [0..TokenSize] OF CHAR; 
OPERAND = ARRAY [0..OperandSize] OF CHAR; 


VAR 
OpLoc, SrcLoc, DestLoc : CARDINAL; (* location of line parts *) 
Line : STRING; 


LineCount : CARDINAL; 
---*) 


PROCEDURE GetLine (f : FILE; VAR EndFile : BOOLEAN); 
(* Inputs a Line -- up to 80 characters ending in cr/lf -- from a file. *) 
CONST 
   MAXLINE = 80; 
VAR 
   i : CARDINAL; 
   c : CHAR; 
BEGIN (* GetLine *) 
   i := 0; 
   LOOP 
      IF EOF (f) THEN 
         EndFile := TRUE; 
         EXIT; 
      END; 

      Read (f, c); 
      IF (c = ASCII.lf) OR (i >= MAXLINE) THEN 
         EXIT; 
      END; 
      Line[i] := c; 
      INC (i); 
   END; 

   IF Line[i - 1] = ASCII.cr THEN   (* Strip cr/lf - terminate with 0C *) 
      Line[i - 1] := 0C; 
   ELSE 
      Line[i] := 0C; 
   END; 

   INC (LineCount); 
END GetLine; 


PROCEDURE SplitLine (VAR Label, OpCode : TOKEN; 
VAR SrcOp, DestOp : OPERAND) ; 
(* Separates TOKENs & OPERANDs from Line. *) 


CONST 
Quote = 47C; 
StringMAX = 12; 


VAR 
i, j : CARDINAL; 
ParCnt : INTEGER; (* Tracks open parentheses *) 
c : CHAR; 
InQuotes : BOOLEAN; 


PROCEDURE White (ch : CHAR) : BOOLEAN; 
BEGIN 
RETURN ((ch = ASCII.ht) OR (ch = ' ')); 
END White; 
PROCEDURE Delimiter (ch : CHAR) : BOOLEAN; 
BEGIN 
RETURN ((ch = 0C) OR 
((NOT InQuotes) AND ((ch = ' ') OR (ch = ASCII.ht)))); 


END Delimiter; 


PROCEDURE OpDelimiter (ch : CHAR) : BOOLEAN; 
BEGIN 
RETURN ((NOT InQuotes) AND (ParCnt = 0) AND (ch = ',')); 


END OpDelimiter; 


PROCEDURE Done (ch : CHAR) : BOOLEAN; 
(* look for start of comment or NULL terminator *) 
BEGIN 
RETURN ((ch = ';') OR (ch = 0C) OR ((ch = '*') AND (i = 0))); 
END Done; 


BEGIN (* SplitLine *) 
   i := 0; 
   InQuotes := FALSE; 

   IF Done (Line[i]) THEN   (* look for blank or all-comment line *) 
      RETURN; 
   END; 

   IF White (Line[i]) THEN 
      INC (i); 
      WHILE White (Line[i]) DO 
         INC (i);   (* Skip spaces & tabs *) 
      END; 
   ELSE   (* Found a Label *) 
      j := 0; 
      c := Line[i]; 
      WHILE (NOT Delimiter (c)) AND (j < TokenSize) DO 
         Label[j] := CAP (c); 
         INC (i); INC (j); 
         c := Line[i]; 
      END; 
      Label[j] := 0C;   (* terminate Label string *) 
      IF j = TokenSize THEN 
         Error (i, TooLong); 
         WHILE NOT Delimiter (Line[i]) DO 
            INC (i);   (* Skip remainder of Too-Long Token *) 
         END; 
      END; 
   END; 

   WHILE White (Line[i]) DO 
      INC (i); 
   END; 

   IF Done (Line[i]) THEN 
      RETURN; 
   ELSE   (* Found an OpCode *) 
      OpLoc := i; 
      j := 0; 
      c := Line[i]; 
      WHILE (NOT Delimiter (c)) AND (j < TokenSize) DO 
         OpCode[j] := CAP (c); 
         INC (i); INC (j); 
         c := Line[i]; 
      END; 
      OpCode[j] := 0C; 
      IF j = TokenSize THEN 
         Error (i, TooLong); 
         WHILE NOT Delimiter (Line[i]) DO 
            INC (i);   (* Skip remainder of Too-Long Token *) 
         END; 
      END; 
   END; 

   WHILE White (Line[i]) DO 
      INC (i); 
   END; 

   IF Done (Line[i]) THEN 
      RETURN; 
   ELSE   (* Found 1st Operand *) 
      SrcLoc := i; 
      j := 0; 
      ParCnt := 0; 
      c := Line[i]; 
      IF c = Quote THEN   (* String Constant *) 
         SrcOp[j] := c; 
         INC (i); INC (j); 
         REPEAT 
            c := Line[i]; 
            SrcOp[j] := c; 
            INC (i); INC (j); 
         UNTIL (c = Quote) OR (j > StringMAX) OR (c = 0C); 
         SrcOp[j] := 0C; 
         IF j > StringMAX THEN 
            Error (i, TooLong); 
         END; 
         RETURN;   (* second operand not allowed after string constant *) 
      ELSE   (* Normal Operand *) 
         WHILE (NOT Delimiter (c)) 
               AND (NOT OpDelimiter (c)) 
               AND (j < OperandSize) DO 
            IF c = Quote THEN 
               InQuotes := NOT InQuotes;   (* Toggle Switch *) 
            END; 
            IF InQuotes THEN 
               SrcOp[j] := c; 
            ELSE 
               SrcOp[j] := CAP (c); 
               IF c = '(' THEN 
                  INC (ParCnt); 
               END; 
               IF c = ')' THEN 
                  DEC (ParCnt); 
               END; 
            END; 
            INC (i); INC (j); 
            c := Line[i]; 
         END; 
         SrcOp[j] := 0C; 

         IF j = OperandSize THEN 
            Error (i, TooLong); 
            WHILE (NOT Delimiter (Line[i])) 
                  AND (NOT OpDelimiter (Line[i])) DO 
               INC (i);   (* Skip remainder of Too-Long Operand *) 
            END; 
         END; 
      END; 
   END; 

   IF NOT OpDelimiter (Line[i]) THEN 
      RETURN;   (* because only one OPERAND *) 
   ELSE   (* Found 2nd Operand *) 
      INC (i);   (* Skip OpDelimiter (comma) *) 
      DestLoc := i; 
      j := 0; 
      c := Line[i]; 
      WHILE (NOT Delimiter (c)) AND (j < OperandSize) DO 
         DestOp[j] := CAP (c); 
         INC (i); INC (j); 
         c := Line[i]; 
      END; 
      DestOp[j] := 0C; 
      IF j = OperandSize THEN 
         Error (i, TooLong); 
      END; 
   END; 
END SplitLine; 


PROCEDURE LineParts (f : FILE; VAR EndFile : BOOLEAN; 
VAR Label, OpCode : TOKEN; 
VAR SrcOp, DestOp : OPERAND); 
(* Reads line, breaks into tokens, on-passes to symbol & code generators *) 
BEGIN 
Line := ""; 
GetLine (f, EndFile); (* read a line from the file *) 
Label := ""; OpCode := ""; SrcOp := ""; DestOp := ""; 


IF EndFile THEN 
Error (0, EndErr); 
ELSE 
SplitLine (Label, OpCode, SrcOp, DestOp) ; 
END; 
END LineParts; 


BEGIN (* MODULE Initialization *) 
OpLoc := 0; SrcLoc := 0; DestLoc := 0; LineCount := 0 
END Parser. 


Listing 10.4 


IMPLEMENTATION MODULE ErrorX68; 
(* Displays error messages for X68000 cross assembler *) 


FROM Terminal IMPORT 
WriteString, WriteLn; 


IMPORT Terminal; (* for Read/Write *) 


FROM Files IMPORT 
FILE; 


IMPORT Files; (* for Write *) 


FROM Strings IMPORT 
Length; 


FROM Conversions IMPORT 
CardToStr; 


IMPORT ASCII; 


FROM Parser IMPORT 
Line, LineCount; 


(*--- 
TYPE 
ErrorType = (Dummy, TooLong, NoCode, SymDup, Undef, SymFull, Phase, 
ModeErr, OperErr, BraErr, AddrErr, SizeErr, EndErr); 


VAR 


ErrorCount : CARDINAL; 
---*) 


VAR 
   FirstTime : BOOLEAN; 


PROCEDURE FileWriteString (f : FILE; VAR Str : ARRAY OF CHAR); 
VAR 
   i : CARDINAL; 
BEGIN 
   i := 0; 
   WHILE Str[i] # 0C DO 
      Files.Write (f, Str[i]); 
      INC (i); 
   END; 
END FileWriteString; 


PROCEDURE Error (Pos : CARDINAL; ErrorNbr : ErrorType); 
(* Displays Error #ErrorNbr, then waits for any key to continue *) 
VAR 
   i : CARDINAL; 
   c : CHAR; 
   CntStr : ARRAY [0..6] OF CHAR; 
   dummy : BOOLEAN; 
BEGIN 

WriteLn; 
dummy := CardToStr (LineCount, CntStr); 
WriteString (CntStr); 
WriteString (" ial 
WriteString (Line); WriteLn; 
(* Make up for LineCnt so * in right spot *) 
FOR i := 1 TO Length (CntStr) DO 

Terminal.Write (' '); 
END; 
WriteString (" Ls Jar 
IF Pos > 0 THEN 

FOR i := 1 TO Pos DO 

Terminal.Write (' '); 

END; 

Terminal.Write ('*'); WriteLn; 
END; 
CASE ErrorNbr OF 
   TooLong : WriteString ("Identifier too long -- Truncated!"); 
|  NoCode  : WriteString ("No such op-code."); 
|  SymDup  : WriteString ("Duplicate Symbol."); 
|  Undef   : WriteString ("Undefined Symbol."); 
|  SymFull : WriteString ("Symbol Table Full -- Maximum = 500!"); 
             WriteLn; 
             WriteString ("Assembly Terminated."); WriteLn; 
             HALT; 
|  Phase   : WriteString ("Pass 1/Pass 2 Address Count Mis-Match."); 
|  ModeErr : WriteString ("This addressing mode not allowed here."); 
|  OperErr : WriteString ("Error in operand format."); 
|  BraErr  : WriteString ("Error in relative branch."); 
|  AddrErr : WriteString ("Address mode error."); 
|  SizeErr : WriteString ("Operand size error."); 
|  EndErr  : WriteString ("Missing END Pseudo-Op."); 
ELSE 
   WriteString ("Unknown Error.");   (* should never get here! *) 
END; 
WriteLn; 


IF FirstTime THEN 
   WriteString ("Hit any key to continue (^C to Terminate) "); 
   Terminal.Read (c); 
   WriteLn; 
   FirstTime := FALSE; 
ELSE 
   Terminal.Read (c); 
END; 


IF c = ASCII.etx THEN 
   WriteString ("Assembly Terminated by Operator."); 
   HALT; 
END; 


INC (ErrorCount); 


IF ErrorCount > 500 THEN 
   WriteString ("Too many errors!"); WriteLn; 
   WriteString ("Assembly Terminated."); WriteLn; 
   HALT; 
END; 
END Error; 


PROCEDURE WriteErrorCount (f : FILE); 
(* Error count output to Console & Listing file *) 
VAR 
   CntStr : ARRAY [0..6] OF CHAR; 
   Msg0 : ARRAY [0..25] OF CHAR; 
   Msg1 : ARRAY [0..10] OF CHAR; 
   Msg2 : ARRAY [0..20] OF CHAR; 
   dummy : BOOLEAN; 
BEGIN 
   Msg0 := "---> END OF ASSEMBLY"; 
   Msg1 := "---> "; 
   Msg2 := " ASSEMBLY ERROR(S)."; 

   dummy := CardToStr (ErrorCount, CntStr); 

   (* Messages to console *) 
   WriteLn; 
   WriteLn; 
   WriteString (Msg0); WriteLn; 
   WriteString (Msg1); 
   WriteString (CntStr); 
   WriteString (Msg2); 
   WriteLn; 
   WriteLn; 

   (* Messages to listing file *) 
   Files.Write (f, ASCII.cr); 
   Files.Write (f, ASCII.lf); 
   Files.Write (f, ASCII.cr); 
   Files.Write (f, ASCII.lf); 

   FileWriteString (f, Msg0); 
   Files.Write (f, ASCII.cr); 
   Files.Write (f, ASCII.lf); 

   FileWriteString (f, Msg1); 
   FileWriteString (f, CntStr); 
   FileWriteString (f, Msg2); 
   Files.Write (f, ASCII.cr); 
   Files.Write (f, ASCII.lf); 

   Files.Write (f, ASCII.ff);   (* feed up next page *) 
END WriteErrorCount; 


BEGIN (* MODULE Initialization *) 
   FirstTime := TRUE; 
   ErrorCount := 0; 
END ErrorX68. 


Listing 10.5 


IMPLEMENTATION MODULE SymbolTable; 
(* Initializes symbol table. Maintains list of all labels, *) 
(* along with their values. Provides access to the list. *) 


FROM LongNumbers IMPORT 
LONG, LongClear; 


FROM Parser IMPORT 
TOKEN; 


FROM Strings IMPORT 


CompareStr; 
CONST 
MAXSYM = 500; (* Maximum entries in Symbol Table *) 
TYPE 
SYMBOL = RECORD 
Name : TOKEN; 
Value : LONG; 
END; 
VAR 


SymTab : ARRAY [1..MAXSYM] OF SYMBOL; 
Next : CARDINAL; (* Array index into next entry in Symbol Table *) 
Top : INTEGER; (* Last used array position as seen by Sort *) 
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PROCEDURE FillSymTab (Label TOKEN; 
(* Add a symbol to the table *) 


Value 


BEGIN 

IF Next <= MAXSYM THEN 
SymTab[Next].Name := Label; 
SymTab[Next].Value := Value; 
INC (Next); 
Full := FALSE; 

ELSE 
Full := TRUE; 

END; 


END FillSymTab; 


LONG; 
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VAR Full : BOOLEAN) ; 


PROCEDURE SortSymTab (VAR NumSyms : CARDINAL);
(* Sort symbols into alphabetical order *)
VAR
    i, j, gap : INTEGER;   (* Shell Sort causes j to go negative *)
    Temp : SYMBOL;

    PROCEDURE Swap;
    BEGIN
        Temp := SymTab[j];
        SymTab[j] := SymTab[j + gap];
        SymTab[j + gap] := Temp;
    END Swap;

BEGIN (* Sort *)
    Top := Next - 1;

    gap := (Top + 1) DIV 2;
    WHILE gap > 0 DO
        i := gap;
        WHILE i <= Top DO
            j := i - gap;
            WHILE j >= 1 DO
                IF CompareStr (SymTab[j].Name, SymTab[j + gap].Name) > 0 THEN
                    Swap;
                END;
                j := j - gap;
            END;
            INC (i);
        END;
        gap := gap DIV 2;
    END;
    NumSyms := Top;
END SortSymTab;


PROCEDURE ReadSymTab (LABEL : ARRAY OF CHAR;
                      VAR Value : LONG;
                      VAR Duplicate : BOOLEAN) : BOOLEAN;
(* Passes Value of Label to calling program -- returns FALSE if the *)
(* Label is not defined.  Also checks for Multiply Defined Symbols  *)

CONST
    GoLower = -1;
    GoHigher = +1;

VAR
    i, j, mid : INTEGER;
    Search : INTEGER;
    Found : BOOLEAN;
    c : CHAR;
    Label : TOKEN;

BEGIN
    LongClear (Value);
    Duplicate := FALSE;
    i := 0;
    REPEAT
        c := LABEL[i];
        Label[i] := c;
        INC (i);
    UNTIL (c = 0C) OR (i > 8);
    IF c # 0C THEN
        RETURN FALSE;
    END;   (* Operand label too long --> Undefined *)

    i := 1;   (* Binary search *)
    j := Top;
    Found := FALSE;

    REPEAT
        mid := (i + j) DIV 2;
        Search := CompareStr (Label, SymTab[mid].Name);

        IF Search = GoLower THEN
            j := mid - 1;
        ELSIF Search = GoHigher THEN
            i := mid + 1;
        ELSE (* Got It! *)
            Found := TRUE;
        END;
    UNTIL (j < i) OR Found;

    IF Found THEN
        IF mid > 1 THEN
            IF CompareStr (SymTab[mid].Name, SymTab[mid - 1].Name) = 0 THEN
                Duplicate := TRUE;   (* Multiply Defined Symbol *)
            END;
        END;
        IF mid < Top THEN
            IF CompareStr (SymTab[mid].Name, SymTab[mid + 1].Name) = 0 THEN
                Duplicate := TRUE;   (* Multiply Defined Symbol *)
            END;
        END;

        Value := SymTab[mid].Value;
        RETURN TRUE;
    ELSE
        RETURN FALSE;
    END;
END ReadSymTab;


PROCEDURE ListSymTab (i : CARDINAL; VAR Label : TOKEN; VAR Value : LONG);
(* Returns the i-th item in the symbol table *)
BEGIN
    IF i < Next THEN
        Label := SymTab[i].Name;
        Value := SymTab[i].Value;
    END;
END ListSymTab;


BEGIN (* MODULE Initialization *)
    FOR Next := 1 TO MAXSYM DO
        SymTab[Next].Name := "";
        LongClear (SymTab[Next].Value);
    END;

    Top := 0;
    Next := 1;
END SymbolTable.


Listing 10.6


IMPLEMENTATION MODULE OperationCodes;
(* Initializes lookup table for Mnemonic OpCodes.  Searches the table *)
(* and returns the bit pattern along with address mode information.  *)


FROM Files IMPORT
FILE, FileState, Open, ReadRec, Close;


FROM Terminal IMPORT
WriteString, WriteLn;


FROM Strings IMPORT
STRING, CompareStr;


FROM Parser IMPORT
TOKEN;


FROM ErrorX68 IMPORT
ErrorType, Error;


CONST
FIRST = 1; (* First 68000 OpCode *)
LAST = 118; (* Last 68000 OpCode *)

(*---


(* These objects are declared in the DEFINITION MODULE *)

TYPE
    ModeTypeA = (RegMem3,    (* 0 = Register, 1 = Memory *)
                 Ry02,       (* Register Ry -- Bits 0-2 *)
                 Rx911,      (* Register Rx -- Bits 9-11 *)
                 Data911,    (* Immediate Data -- Bits 9-11 *)
                 CntR911,    (* Count Register or Immediate Data *)
                 Brnch,      (* Relative Branch *)
                 DecBr,      (* Decrement and Branch *)
                 Data03,     (* Used for VECT only *)
                 Data07,     (* MOVEQ *)
                 OpM68D,     (* Data *)
                 OpM68A,     (* Address *)
                 OpM68C,     (* Compare *)
                 OpM68X,     (* XOR *)
                 OpM68S,     (* Sign Extension *)
                 OpM68R,     (* Register/Memory *)
                 OpM37);     (* Exchange Registers *)

    ModeTypeB = (Bit811,     (* BIT operations - bits 8/11 as switch *)
                 Size67,     (* 00 = Byte, 01 = Word, 10 = Long *)
                 Size6,      (* 0 = Word, 1 = Long *)
                 Size1213A,  (* 01 = Byte, 11 = Word, 10 = Long *)
                 Size1213,   (* 11 = Word, 10 = Long *)
                 Exten,      (* OpCode extension required *)
                 EA05a,      (* Effective Address - ALL *)
                 EA05b,      (* Less 1 *)
                 EA05c,      (* Less 1, 11 *)
                 EA05d,      (* Less 9, 10, 11 *)
                 EA05e,      (* Less 1, 9, 10, 11 *)
                 EA05f,      (* Less 0, 1, 3, 4, 11 *)
                 EA05x,      (* Dual mode - OR/AND *)
                 EA05y,      (* Dual mode - ADD/SUB *)
                 EA05z,      (* Dual mode - MOVEM *)
                 EA611);     (* Used only by MOVE *)

    ModeA = SET OF ModeTypeA;

    ModeB = SET OF ModeTypeB;

---*)

TYPE
    TableRecord = RECORD
        Mnemonic : TOKEN;
        Op : BITSET;
        AddrModeA : ModeA;
        AddrModeB : ModeB;
    END;

VAR
    Table68K : ARRAY [FIRST..LAST] OF TableRecord;
    i : CARDINAL;   (* index variable for initializing Table68K *)
    f : FILE;


PROCEDURE Instructions (MnemonSym : TOKEN;
                        OpLoc : CARDINAL; VAR Op : BITSET;
                        VAR AddrModeA : ModeA; VAR AddrModeB : ModeB);
(* Uses lookup table to find addressing mode & bit pattern of opcode. *)

CONST
    GoLower = -1;
    GoHigher = +1;

VAR
    Top, Bottom, Look : CARDINAL;   (* index to Op-code table *)
    Found : BOOLEAN;
    Search : INTEGER;

BEGIN
    Bottom := FIRST;
    Top := LAST;
    Found := FALSE;

    REPEAT (* Binary Search *)
        Look := (Bottom + Top) DIV 2;
        Search := CompareStr (MnemonSym, Table68K[Look].Mnemonic);

        IF Search = GoLower THEN
            Top := Look - 1;
        ELSIF Search = GoHigher THEN
            Bottom := Look + 1;
        ELSE (* Got It! *)
            Found := TRUE;
        END;
    UNTIL (Top < Bottom) OR Found;

    IF Found THEN
        (* Return the instruction, mode, and address restrictions *)
        Op := Table68K[Look].Op;
        AddrModeA := Table68K[Look].AddrModeA;
        AddrModeB := Table68K[Look].AddrModeB;
    ELSE
        Error (OpLoc, NoCode);
    END;
END Instructions;


BEGIN (* MODULE Initialization *) 


    IF Open (f, "OPCODE.DAT") # FileOK THEN           (* Try default drive first, *)
        IF Open (f, "A:OPCODE.DAT") # FileOK THEN     (* then check Drive A, *)
            IF Open (f, "B:OPCODE.DAT") # FileOK THEN (* and finally Drive B. *)
                WriteString ("Can't Find 'OPCODE.DAT'.");
                WriteLn;
                HALT;
            END;
        END;
    END;

    FOR i := FIRST TO LAST DO   (* read 68000 data table *)
        ReadRec (f, Table68K[i]);
    END;

    IF Close (f) # FileOK THEN
        (* Don't worry about it! *)
    END;
END OperationCodes. 
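
The Instructions procedure above finds a mnemonic by binary search, which only works if the records in OPCODE.DAT are stored in ascending mnemonic order. The following is a minimal sketch of that lookup, written in Python rather than the listing's Modula-2. The names `TABLE` and `lookup` are hypothetical, and the five sample rows simply copy their bit sets from the table built in Listing 10.7. Unlike the REPEAT loop in the listing, the sketch tests the bounds before the first probe, so it also copes with an empty table.

```python
# Hypothetical miniature of Table68K: (mnemonic, set of opcode bit numbers).
# The rows are kept in ascending order because binary search needs a sorted table.
TABLE = [
    ("ABCD", {15, 14, 8}),
    ("ADD",  {15, 14, 12}),
    ("ADDA", {15, 14, 12}),
    ("ADDQ", {14, 12}),
    ("AND",  {15, 14}),
]


def lookup(mnemonic, table=TABLE):
    """Return the bit set stored for mnemonic, or None if it is not in the table."""
    bottom, top = 0, len(table) - 1
    while bottom <= top:
        look = (bottom + top) // 2
        name, bits = table[look]
        if mnemonic < name:        # corresponds to the GoLower branch
            top = look - 1
        elif mnemonic > name:      # corresponds to the GoHigher branch
            bottom = look + 1
        else:                      # exact match
            return bits
    return None                    # the listing reports a NoCode error here
```

A caller would treat a `None` result the way the listing treats a failed search, by reporting an unknown opcode at the operation's column.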


Listing 10.7


MODULE InitOperationCodes; 
(* Module to construct the file containing the Operation Code Data Table *) 


FROM Files IMPORT 
FILE, FileState, Create, WriteRec, Close; 


FROM Terminal IMPORT 
WriteString, WriteLn; 


FROM Parser IMPORT 
TOKEN; 


FROM OperationCodes IMPORT 
ModeTypeA, ModeTypeB, ModeA, ModeB; 


CONST 
FIRST = 1; 
LAST = 118; 


(*---

(* These Objects Imported from the DEFINITION MODULE of OperationCodes *)

TYPE
    ModeTypeA = (RegMem3,    (* 0 = Register, 1 = Memory *)
                 Ry02,       (* Register Ry -- Bits 0-2 *)
                 Rx911,      (* Register Rx -- Bits 9-11 *)
                 Data911,    (* Immediate Data -- Bits 9-11 *)
                 CntR911,    (* Count Register or Immediate Data *)
                 Brnch,      (* Relative Branch *)
                 DecBr,      (* Decrement and Branch *)
                 Data03,     (* Used for VECT only *)
                 Data07,     (* Branch & MOVEQ *)
                 OpM68D,     (* Data *)
                 OpM68A,     (* Address *)
                 OpM68C,     (* Compare *)
                 OpM68X,     (* XOR *)
                 OpM68S,     (* Sign Extension *)
                 OpM68R,     (* Register/Memory *)
                 OpM37);     (* Exchange Registers *)

    ModeTypeB = (Bit811,     (* BIT operations - bits 8/11 as switch *)
                 Size67,     (* 00 = Byte, 01 = Word, 10 = Long *)
                 Size6,      (* 0 = Word, 1 = Long *)
                 Size1213A,  (* 01 = Byte, 11 = Word, 10 = Long *)
                 Size1213,   (* 11 = Word, 10 = Long *)
                 Exten,      (* OpCode extension required *)
                 EA05a,      (* Effective Address - ALL *)
                 EA05b,      (* Less 1 *)
                 EA05c,      (* Less 1, 11 *)
                 EA05d,      (* Less 9, 10, 11 *)
                 EA05e,      (* Less 1, 9, 10, 11 *)
                 EA05f,      (* Less 0, 1, 3, 4, 11 *)
                 EA05x,      (* Dual mode - OR/AND *)
                 EA05y,      (* Dual mode - ADD/SUB *)
                 EA05z,      (* Dual mode - MOVEM *)
                 EA611);     (* Used only by MOVE *)

    ModeA = SET OF ModeTypeA;
    ModeB = SET OF ModeTypeB;

---*)
TYPE
    TableRecord = RECORD
        Mnemonic : TOKEN;
        Op : BITSET;
        AddrModeA : ModeA;
        AddrModeB : ModeB;
    END;

VAR
    Table68K : ARRAY [FIRST..LAST] OF TableRecord;
    i : CARDINAL;   (* index variable for initializing Table68K *)
    f : FILE;       (* "OPCODE.DAT" *)

(* TableRecord & Table68K are identical to structures declared *)
(* in the IMPLEMENTATION MODULE of OperationCodes              *)
(*-------------------------------------------------------------*)
BEGIN (* InitOperationCodes *) 

i := 1;
WITH Table68K[i] DO
  Mnemonic := "ABCD";
  Op := {15, 14, 8};
  AddrModeA := ModeA{Rx911, RegMem3, Ry02};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ADD";
  Op := {15, 14, 12};
  AddrModeA := ModeA{OpM68D};
  AddrModeB := ModeB{EA05y};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ADDA";
  Op := {15, 14, 12};
  AddrModeA := ModeA{OpM68A};
  AddrModeB := ModeB{EA05a};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ADDI";
  Op := {10, 9};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ADDQ";
  Op := {14, 12};
  AddrModeA := ModeA{Data911};
  AddrModeB := ModeB{Size67, EA05d};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ADDX";
  Op := {15, 14, 12, 8};
  AddrModeA := ModeA{RegMem3, Rx911, Ry02};
  AddrModeB := ModeB{Size67};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "AND";
  Op := {15, 14};
  AddrModeA := ModeA{OpM68D};
  AddrModeB := ModeB{EA05x};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ANDI";
  Op := {9};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e, Size67, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ASL";
  Op := {15, 14, 13, 8};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ASR";
  Op := {15, 14, 13};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BCC";
  Op := {14, 13, 10};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);

WITH Table68K[i] DO
  Mnemonic := "BCHG";
  Op := {6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e, Exten, Bit811};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BCLR";
  Op := {7};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e, Exten, Bit811};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BCS";
  Op := {14, 13, 10, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BEQ";
  Op := {14, 13, 10, 9, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BGE";
  Op := {14, 13, 11, 10};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BGT";
  Op := {14, 13, 11, 10, 9};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BHI";
  Op := {14, 13, 9};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BLE";
  Op := {14, 13, 11, 10, 9, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BLS";
  Op := {14, 13, 9, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BLT";
  Op := {14, 13, 11, 10, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BMI";
  Op := {14, 13, 11, 9, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BNE";
  Op := {14, 13, 10, 9};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BPL";
  Op := {14, 13, 11, 9};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BRA";
  Op := {14, 13};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);

WITH Table68K[i] DO
  Mnemonic := "BSET";
  Op := {7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e, Exten, Bit811};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BSR";
  Op := {14, 13, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BTST";
  Op := {};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05c, Exten, Bit811};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BVC";
  Op := {14, 13, 11};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "BVS";
  Op := {14, 13, 11, 8};
  AddrModeA := ModeA{Brnch};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "CHK";
  Op := {14, 8, 7};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{EA05b};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "CLR";
  Op := {14, 9};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "CMP";
  Op := {15, 13, 12};
  AddrModeA := ModeA{OpM68C};
  AddrModeB := ModeB{EA05a};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "CMPA";
  Op := {15, 13, 12};
  AddrModeA := ModeA{OpM68A};
  AddrModeB := ModeB{EA05a};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "CMPI";
  Op := {11, 10};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "CMPM";
  Op := {15, 13, 12, 8, 3};
  AddrModeA := ModeA{Rx911, Ry02};
  AddrModeB := ModeB{Size67};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBCC";
  Op := {14, 12, 10, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBCS";
  Op := {14, 12, 10, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBEQ";
  Op := {14, 12, 10, 9, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);

WITH Table68K[i] DO
  Mnemonic := "DBF";
  Op := {14, 12, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBGE";
  Op := {14, 12, 11, 10, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBGT";
  Op := {14, 12, 11, 10, 9, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBHI";
  Op := {14, 12, 9, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBLE";
  Op := {14, 12, 11, 10, 9, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBLS";
  Op := {14, 12, 9, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBLT";
  Op := {14, 12, 11, 10, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);

WITH Table68K[i] DO
  Mnemonic := "DBMI";
  Op := {14, 12, 11, 9, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBNE";
  Op := {14, 12, 10, 9, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBPL";
  Op := {14, 12, 11, 9, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBRA";
  Op := {14, 12, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBT";
  Op := {14, 12, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBVC";
  Op := {14, 12, 11, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DBVS";
  Op := {14, 12, 11, 8, 7, 6, 3};
  AddrModeA := ModeA{DecBr};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DIVS";
  Op := {15, 8, 7, 6};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{EA05b};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "DIVU";
  Op := {15, 7, 6};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{EA05b};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "EOR";
  Op := {15, 13, 12};
  AddrModeA := ModeA{OpM68X};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "EORI";
  Op := {11, 9};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "EXG";
  Op := {15, 14, 8};
  AddrModeA := ModeA{OpM37};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "EXT";
  Op := {14, 11};
  AddrModeA := ModeA{OpM68S};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ILLEGAL";
  Op := {14, 11, 9, 7, 6, 5, 4, 3, 2};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "JMP";
  Op := {14, 11, 10, 9, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05f};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "JSR";
  Op := {14, 11, 10, 9, 7};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05f};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "LEA";
  Op := {14, 8, 7, 6};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{EA05f};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "LINK";
  Op := {14, 11, 10, 9, 6, 4};
  AddrModeA := ModeA{Ry02};
  AddrModeB := ModeB{Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "LSL";
  Op := {15, 14, 13, 9, 8, 3};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "LSR";
  Op := {15, 14, 13, 9, 3};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MOVE";
  Op := {};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size1213A, EA611};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MOVEA";
  Op := {6};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{Size1213, EA05a};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MOVEM";
  Op := {14, 11, 7};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size6, EA05z, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MOVEP";
  Op := {3};
  AddrModeA := ModeA{OpM68R};
  AddrModeB := ModeB{Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MOVEQ";
  Op := {14, 13, 12};
  AddrModeA := ModeA{Data07};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MULS";
  Op := {15, 14, 8, 7, 6};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{EA05b};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "MULU";
  Op := {15, 14, 7, 6};
  AddrModeA := ModeA{Rx911};
  AddrModeB := ModeB{EA05b};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "NBCD";
  Op := {14, 11};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);

WITH Table68K[i] DO
  Mnemonic := "NEG";
  Op := {14, 10};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "NEGX";
  Op := {14};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "NOP";
  Op := {14, 11, 10, 9, 6, 5, 4, 0};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "NOT";
  Op := {14, 10, 9};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "OR";
  Op := {15};
  AddrModeA := ModeA{OpM68D};
  AddrModeB := ModeB{EA05x};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ORI";
  Op := {};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "PEA";
  Op := {14, 11, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05f};
END;
INC (i);

WITH Table68K[i] DO
  Mnemonic := "RESET";
  Op := {14, 11, 10, 9, 6, 5, 4};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ROL";
  Op := {15, 14, 13, 10, 9, 8, 4, 3};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ROR";
  Op := {15, 14, 13, 10, 9, 4, 3};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ROXL";
  Op := {15, 14, 13, 10, 8, 4};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ROXR";
  Op := {15, 14, 13, 10, 4};
  AddrModeA := ModeA{CntR911};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "RTE";
  Op := {14, 11, 10, 9, 6, 5, 4, 1, 0};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "RTR";
  Op := {14, 11, 10, 9, 6, 5, 4, 2, 1, 0};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "RTS";
  Op := {14, 11, 10, 9, 6, 5, 4, 2, 0};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SBCD";
  Op := {15, 8};
  AddrModeA := ModeA{Rx911, RegMem3, Ry02};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SCC";
  Op := {14, 12, 10, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SCS";
  Op := {14, 12, 10, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SEQ";
  Op := {14, 12, 10, 9, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SF";
  Op := {14, 12, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SGE";
  Op := {14, 12, 11, 10, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SGT";
  Op := {14, 12, 11, 10, 9, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SHI";
  Op := {14, 12, 9, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SLE";
  Op := {14, 12, 11, 10, 9, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SLS";
  Op := {14, 12, 9, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SLT";
  Op := {14, 12, 11, 10, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SMI";
  Op := {14, 12, 11, 9, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SNE";
  Op := {14, 12, 10, 9, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SPL";
  Op := {14, 12, 11, 9, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "ST";
  Op := {14, 12, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "STOP";
  Op := {14, 11, 10, 9, 6, 5, 4, 1};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SUB";
  Op := {15, 12};
  AddrModeA := ModeA{OpM68D};
  AddrModeB := ModeB{EA05y};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SUBA";
  Op := {15, 12};
  AddrModeA := ModeA{OpM68A};
  AddrModeB := ModeB{EA05a};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SUBI";
  Op := {10};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e, Exten};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SUBQ";
  Op := {14, 12, 8};
  AddrModeA := ModeA{Data911};
  AddrModeB := ModeB{Size67, EA05d};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SUBX";
  Op := {15, 12, 8};
  AddrModeA := ModeA{Rx911, RegMem3, Ry02};
  AddrModeB := ModeB{Size67};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SVC";
  Op := {14, 12, 11, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SVS";
  Op := {14, 12, 11, 8, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "SWAP";
  Op := {14, 11, 6};
  AddrModeA := ModeA{Ry02};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "TAS";
  Op := {14, 11, 9, 7, 6};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "TRAP";
  Op := {14, 11, 10, 9, 6};
  AddrModeA := ModeA{Data03};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "TRAPV";
  Op := {14, 11, 10, 9, 6, 5, 4, 2, 1};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "TST";
  Op := {14, 11, 9};
  AddrModeA := ModeA{};
  AddrModeB := ModeB{Size67, EA05e};
END;
INC (i);
WITH Table68K[i] DO
  Mnemonic := "UNLK";
  Op := {14, 11, 10, 9, 6, 4, 3};
  AddrModeA := ModeA{Ry02};
  AddrModeB := ModeB{};
END;


IF Create (f, "OPCODE.DAT") # FileOK THEN 
WriteString ("Unable to Create OpCode File."); 
WriteLn; 

HALT; 

END; 


FOR i := FIRST TO LAST DO 
WriteRec (f, Table68K[i]); 
END; 


IF Close (f) # FileOK THEN 
WriteString ("Unable to Close OpCode File."); 
WriteLn; 
END; 
END InitOperationCodes. 
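
Each Op value in the table above is a BITSET whose members are the bit positions that are set in the 16-bit base opcode. The sketch below, in Python rather than Modula-2, shows that reading. It assumes that set element n corresponds to bit n of the opcode word, with bit 0 as the least significant bit; that assumption agrees with the table entries for NOP ($4E71) and RTS ($4E75). The name `bits_to_word` is hypothetical and is not part of the assembler.

```python
def bits_to_word(bits):
    """Pack a collection of bit numbers (0..15) into a 16-bit opcode word."""
    word = 0
    for b in bits:
        if not 0 <= b <= 15:
            raise ValueError("bit number out of range for a 16-bit word")
        word |= 1 << b             # set bit b of the word
    return word


# Bit sets copied from the NOP and RTS entries of the table.
nop = bits_to_word({14, 11, 10, 9, 6, 5, 4, 0})
rts = bits_to_word({14, 11, 10, 9, 6, 5, 4, 2, 0})
print(f"NOP = ${nop:04X}, RTS = ${rts:04X}")
```

Entries such as MOVE and ORI carry an empty set because every bit of their opcode word is supplied later from the size and addressing-mode fields.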


Listing 10.8 


IMPLEMENTATION MODULE SyntaxAnalyzer; 
(* Analyzes the operands to provide information for CodeGenerator *) 


FROM Conversions IMPORT 
StrToCard; 


FROM Strings IMPORT 
Length; 


FROM LongNumbers IMPORT 
LONG, LongAdd, LongSub, CardToLong, StringToLong, BinStrToLong; 


FROM SymbolTable IMPORT 
SortSymTab, ReadSymTab; 


FROM ErrorX68 IMPORT 
ErrorType, Error; 


FROM Parser IMPORT 
OPERAND, SrcLoc; 


FROM CodeGenerator IMPORT
LZero, AddrCnt, Pass2; (* BOOLEAN Switch *)


CONST
    Zero = 30H;    (* The Ordinal value of the Character '0' *)
    Seven = 37H;   (* The Ordinal value of the Character '7' *)
    Quote = 47C;


(*---

(* These objects defined in the DEFINITION MODULE *)

TYPE
    OpMode = (DReg,     (* Data Register *)
              ARDir,    (* Address Register Direct *)
              ARInd,    (* Address Register Indirect *)
              ARPost,   (* Address Register with Post-Increment *)
              ARPre,    (* Address Register with Pre-Decrement *)
              ARDisp,   (* Address Register with Displacement *)
              ARDisX,   (* Address Register with Disp. & Index *)
              AbsW,     (* Absolute Word (16-bit Address) *)
              AbsL,     (* Absolute Long (32-bit Address) *)
              PCDisp,   (* Program Counter Relative, with Displacement *)
              PCDisX,   (* Program Counter Relative, with Disp. & Index *)
              Imm,      (* Immediate *)
              MultiM,   (* Multiple Register Move *)
              SR,       (* Status Register *)
              CCR,      (* Condition Code Register *)
              USP,      (* User's Stack Pointer *)
              Null);    (* Error Condition, or Operand missing *)

    Xtype = (X0, Dreg, Areg);
    SizeType = (S0, Byte, Word, S3, Long);

    OpConfig = RECORD   (* OPERAND CONFIGURATION *)
        Mode : OpMode;
        Value : LONG;
        Loc : CARDINAL;     (* Location of Operand on line *)
        Rn : CARDINAL;      (* Register number *)
        Xn : CARDINAL;      (* Index Reg. nbr. *)
        Xsize : SizeType;   (* size of Index *)
        X : Xtype;          (* Is index Data or Address reg? *)
    END;

---*)

VAR
    AbsSize : SizeType;   (* size of operand (Absolute only) *)


PROCEDURE CalcValue (Operand : OPERAND; VAR Value : LONG);
(* Calculates left and right values for GetValue *)

VAR
    Neg : BOOLEAN;
    Dup : BOOLEAN;
    Num : CARDINAL;
    NumSyms : CARDINAL;

BEGIN
    IF Operand[0] = '-' THEN
        Neg := TRUE;
        Operand[0] := '0';
    ELSE
        Neg := FALSE;
    END;

    IF StrToCard (Operand, Num) THEN
        (* It is a number *)
        CardToLong (Num, Value);
        IF Neg THEN
            LongSub (LZero, Value, Value);
        END;
    ELSIF StringToLong (Operand, Value) THEN
        (* It is a HEX number *)
    ELSIF BinStrToLong (Operand, Value) THEN
        (* It is a Binary number *)
    ELSIF (Operand[0] = Quote) AND (Operand[2] = Quote) THEN
        CardToLong (ORD (Operand[1]), Value);
    ELSIF (Length (Operand) = 1) AND (Operand[0] = '*') THEN
        Value := AddrCnt;
    ELSE
        (* It is a label, but may be undefined! *)
        IF NOT Pass2 THEN
            SortSymTab (NumSyms);
        END;
        IF NOT ReadSymTab (Operand, Value, Dup) THEN
            Error (SrcLoc, Undef);
        END;
        IF Dup THEN
            Error (SrcLoc, SymDup);
        END;
    END;
END CalcValue;


PROCEDURE GetValue (Operand : OPERAND; VAR Value : LONG);
(* determines value of operand (in Decimal, HEX, or via Symbol Table) *)

VAR
    TempOp : OPERAND;
    TempVal : LONG;
    c, op : CHAR;
    i, j : CARDINAL;
    InQuotes : BOOLEAN;

BEGIN
    i := 0;
    Value := LZero;
    InQuotes := FALSE;
    op := '+';
    REPEAT
        j := 0;
        LOOP
            c := Operand[i];
            TempOp[j] := c;
            IF c = Quote THEN
                InQuotes := NOT InQuotes;
            END;
            INC (i);   INC (j);
            IF c = 0C THEN
                EXIT;
            END;
            IF (c = '+') AND (NOT InQuotes) THEN
                EXIT;
            END;
            IF (c = '-') AND (i > 1) AND (NOT InQuotes) THEN
                EXIT;
            END;
        END;
        TempOp[j - 1] := 0C;   (* in case c is +/- *)
        CalcValue (TempOp, TempVal);
        IF op = '-' THEN
            LongSub (Value, TempVal, Value);
        ELSE
            LongAdd (Value, TempVal, Value);
        END;
        op := c;
    UNTIL op = 0C;
END GetValue;


PROCEDURE GetSize (VAR Symbol : ARRAY OF CHAR; VAR Size : SizeType);
(* determines size of opcode/operand: Byte, Word, Long *)

VAR
    i : CARDINAL;
    c : CHAR;

BEGIN
    i := 0;
    REPEAT
        c := Symbol[i];
        INC (i);
    UNTIL (c = 0C) OR (c = '.');

    IF c = 0C THEN
        Size := Word;   (* Default to size Word = 16 bits *)
    ELSE
        c := Symbol[i];         (* Record size extension *)
        Symbol[i - 1] := 0C;    (* Chop size extension off *)
        IF (c = 'B') OR (c = 'S') THEN   (* Byte or Short Branch/Jump *)
            Size := Byte;
        ELSIF c = 'L' THEN
            Size := Long;
        ELSE
            Size := Word;   (* Default size *)
        END;
    END;
END GetSize;


PROCEDURE GetAbsSize (VAR Symbol : ARRAY OF CHAR; VAR AbsSize : SizeType);
(* determines size of operand: Word or Long *)

VAR
    i : CARDINAL;
    c : CHAR;
    ParCnt : INTEGER;

BEGIN
    ParCnt := 0;
    i := 0;
    REPEAT
        c := Symbol[i];
        IF c = '(' THEN
            INC (ParCnt);
        END;
        IF c = ')' THEN
            DEC (ParCnt);
        END;
        INC (i);
    UNTIL (c = 0C) OR ((c = '.') AND (ParCnt = 0));

    IF c = 0C THEN
        AbsSize := Long;
    ELSE
        c := Symbol[i];         (* Record size extension *)
        Symbol[i - 1] := 0C;    (* Chop size extension off *)
        IF (c = 'W') OR (c = 'S') THEN
            AbsSize := Word;
        ELSE
            AbsSize := Long;
        END;
    END;
END GetAbsSize;


PROCEDURE GetInstModeSize (Mode : OpMode; Size : SizeType;
VAR InstSize : CARDINAL) : CARDINAL;
(* Determines the size for the various instruction modes. *)

VAR
n : CARDINAL;

BEGIN
CASE Mode OF
ARDisp,
ARDisX,
PCDisp,
PCDisX,
AbsW : n := 2;
| AbsL : n := 4;
| MultiM : IF Pass2 THEN
n := 0; (* accounted for by code generator *)
ELSE
n := 2;
END;
| Imm : IF Size = Long THEN
n := 4;
ELSE
n := 2;
END;
ELSE
n := 0;
END;
INC (InstSize, n);
RETURN (n * 2);
END GetInstModeSize;


PROCEDURE GetOperand (Oper : OPERAND; VAR Op : OpConfig);
(* Finds mode and value for source or destination operand *)

VAR
ch : CHAR;
C : CARDINAL; (* holds the ordinal value of a character *)
i, j : CARDINAL;
Len : CARDINAL; (* Calculated Length of Oper *)
TempOp : OPERAND;
MultFlag : BOOLEAN;


BEGIN 


Op.Mode := Null; Op.X := X0; 
Len := Length (Oper); 


IF Len = 0 THEN 
RETURN; 
END; 


GetAbsSize (Oper, AbsSize); 


IF Oper[0] = '#' THEN (* Immediate *) 
IF Pass2 THEN 
i := 0;
REPEAT 
INC (i); 
Oper[i - 1] := Oper[i]; 


UNTIL Oper[i] = 0C; 
GetValue (Oper, Op.Value); 
END; 
Op.Mode := Imm; 
RETURN; 
END; 


IF Len = 2 THEN (* possible Addr or Data Register *) 


C := ORD (Oper[1]); 


IF (Oper[0] = 'S') AND (Oper[1] = 'R') THEN 
(* Status Register *) 
Op.Mode := SR; 
RETURN; 
ELSIF (Oper[0] = 'S') AND (Oper[1] = 'P') THEN 


(* Stack Pointer *) 
Op.Mode := ARDir; 
Op.Rn := 7; 
RETURN; 
ELSIF (C >= Zero) AND (C <= Seven) THEN 
(* Looks Like an Addr or Data Reg *) 


IF Oper[0] = 'A' THEN (* Address Register *) 
Op.Mode := ARDir; 
Op.Rn := C - Zero; 


RETURN;
ELSIF Oper[0] = 'D' THEN (* Data Register *)
Op.Mode := DReg; 
Op.Rn := C - Zero; 
RETURN; 
ELSE 
(* may be a label -- ignore for now *) 
END; 
ELSE 
(* may be a label -- ignore for now *) 
END; 


END; 


IF Len = 3 THEN 


IF (Oper[0] = 'C') AND (Oper[1] = 'C') AND (Oper[2] = 'R') THEN
(* Condition Code Register *)
Op.Mode := CCR;
RETURN;
ELSIF (Oper[0] = 'U') AND (Oper[1] = 'S') AND (Oper[2] = 'P') THEN
(* User's Stack Pointer *) 
Op.Mode := USP; 
RETURN; 
ELSE 
(* may be a label -- ignore for now *) 
END; 
END; 
IF (Len = 4) AND (Oper[0] = '(') AND (Oper[3] = ')') THEN
IF (Oper[1] = 'S') AND (Oper[2] = 'P') THEN
Op.Mode := ARInd;
Op.Rn := 7;
RETURN;
ELSIF Oper[1] = 'A' THEN


C := ORD (Oper[2]); 
IF (C >= Zero) AND (C <= Seven) THEN 
Op.Mode := ARInd; 
Op.Rn := C - Zero; 
RETURN; 
ELSE 
Error (Op.Loc, SizeErr); 
RETURN; 
END; 
ELSE 
Error (Op.Loc, AddrErr); 
RETURN; 
END; 
END; 


IF (Len = 5) AND (Oper[0] = '(')
AND (Oper[3] = ')') AND (Oper[4] = '+') THEN
(* Address Indirect with Post Inc *)
IF (Oper[1] = 'S') AND (Oper[2] = 'P') THEN
(* System Stack Pointer *) 
Op.Mode := ARPost; 
Op.Rn := 7; 
RETURN 
ELSIF Oper[1] = 'A' THEN 


C := ORD (Oper[2]); 

IF (C >= Zero) AND (C <= Seven) THEN 
Op.Mode := ARPost; 
Op.Rn := C - Zero; 
RETURN;
ELSE
Error (Op.Loc, SizeErr); 
RETURN; 
END; 
ELSE 
Error (Op.Loc, AddrErr); 
RETURN; 
END; 
END; 
IF (Len = 5) AND (Oper[0] = '-') 
AND (Oper[1] = '(') AND (Oper[4] = ')') THEN
IF (Oper[2] = 'S') AND (Oper[3] = 'P') THEN
(* System Stack Pointer *)
Op.Mode := ARPre;
Op.Rn := 7;
RETURN;
ELSIF Oper[2] = 'A' THEN
C := ORD (Oper[3]);
IF (C >= Zero) AND (C <= Seven) THEN 
Op.Mode := ARPre; 
Op.Rn := C - Zero; 
RETURN; 
ELSE 
Error (Op.Loc, SizeErr); 
RETURN; 
END; 
ELSE 
Error (Op.Loc, AddrErr); 
RETURN; 
END; 
END; 


(* Try to split off displacement (if present) *) 


i := 0;
ch := Oper[i];
WHILE (ch # '(') AND (ch # 0C) DO (* move to TempOp *)
TempOp[i] := ch;
INC (i);
ch := Oper[i];
END;
TempOp[i] := 0C; (* Displacement (if it exists) now in TempOp *)
IF (ch = '(') AND (TempOp[i - 1] # '+') THEN
(* looks like a displacement mode *)
IF Pass2 THEN
GetValue (TempOp, Op.Value); (* Value of Disp. *)
END;

j := 0;
REPEAT (* put rest of operand (eg. (An,Xi) in TempOp *)
ch := Oper[i];
TempOp[j] := ch;
INC (i); INC (j);
UNTIL ch = 0C;

IF Length (TempOp) > 4 THEN (* Index may be present *)
i := 4; (* Index starts at 4 *)
j := 0;
REPEAT (* put Xi in Oper *)
ch := TempOp[i];
Oper[j] := ch;
INC (i); INC (j);
UNTIL ch = 0C;


IF Oper[j - 2] = ')' THEN 
Oper[j - 2] := 0C; 

ELSE 
Error (Op.Loc, AddrErr); 
RETURN; 

END; 


GetSize (Oper, Op.Xsize); 

IF Op.Xsize = Byte THEN 
Error (Op.Loc, SizeErr); 
RETURN; 

END; 


C := ORD (Oper[1]); 
IF (Oper[0] = 'S') AND (Oper[1] = 'P') THEN 
(* Stack Pointer *) 
Op.X := Areg; 
Op.Xn := 7; 
ELSIF Oper[0] = 'A' THEN
IF (C >= Zero) AND (C <= Seven) THEN
Op.X := Areg;
Op.Xn := C - Zero;
ELSE
Error (Op.Loc, SizeErr);
RETURN;
END;
ELSIF Oper[0] = 'D' THEN 
IF (C >= Zero) AND (C <= Seven) THEN 
Op.X := Dreg; 
Op.Xn := C - Zero; 
ELSE 
Error (Op.Loc, SizeErr); 
RETURN; 
END; 
ELSE 
Error (Op.Loc, AddrErr); 
RETURN; 
END; 


IF (TempOp[1] = 'P') AND (TempOp[2] = 'C') THEN
Op.Mode := PCDisX;
RETURN; 
ELSIF (TempOp[1] = 'S') AND (TempOp[2] = 'P') THEN 
(* Stack Pointer *) 
Op.Rn := 7; 
Op.Mode := ARDisX; 
RETURN; 
ELSIF TempOp[1] = 'A' THEN 
C := ORD (TempOp[2]); 
IF (C >= Zero) AND (C <= Seven) THEN 
Op.Rn := C - Zero; 
Op.Mode := ARDisX; 
RETURN;
ELSE
Error (Op.Loc, SizeErr); 
RETURN; 
END; 
ELSE 
Error (Op.Loc, AddrErr) ; 
RETURN; 
END; 
ELSE (* No Index *) 
IF (TempOp[1] = 'P') AND (TempOp[2] = 'C') THEN 
Op.Mode := PCDisp; 
RETURN; 
ELSIF (TempOp[1] = 'S') AND (TempOp[2] = 'P') THEN 


(* Stack Pointer *) 
Op.Mode := ARDisp; 
Op.Rn := 7; 
RETURN; 
ELSIF TempOp[1] = 'A' THEN 
C := ORD (TempOp[2]); 
IF (C >= Zero) AND (C <= Seven) THEN 
Op.Rn := C - Zero; 
Op.Mode := ARDisp; 
RETURN; 
ELSE 
Error (Op.Loc, SizeErr) ; 
RETURN; 
END; 
ELSE 
Error (Op.Loc, AddrErr); 
RETURN; 
END; 
END; 
END; 


(* Check to see if this could be a register list for MOVEM: *) 
i := 0;
MultFlag := FALSE;
LOOP
ch := Oper[i]; INC (i);
IF ch = 0C THEN
MultFlag := FALSE;
EXIT;
END;
IF (ch = 'A') OR (ch = 'D') THEN
ch := Oper[i]; INC (i); C := ORD (ch);
IF ch = 0C THEN
MultFlag := FALSE;
EXIT;
END;
IF (C >= Zero) AND (C <= Seven) THEN
ch := Oper[i]; INC (i);
IF ch = 0C THEN
EXIT 
END; 
IF (ch = '/') OR (ch = '-') THEN 
MultFlag := TRUE; 
END; 
ELSE 
MultFlag := FALSE; 
EXIT; 
END;
ELSE
MultFlag := FALSE; 
EXIT; 

END; 

END; 

IF MultFlag THEN 
Op.Mode := MultiM; 
RETURN; 

END; 


(* Must be absolute mode! *) 
IF Pass2 THEN 
GetValue (Oper, Op.Value) ; 


END; 
IF AbsSize = Word THEN 
Op.Mode := AbsW; 
ELSE 
Op.Mode := AbsL; 
END; 


END GetOperand; 


PROCEDURE GetMultReg (Oper : OPERAND; PreDec : BOOLEAN; 
Loc : CARDINAL; VAR MultExt : BITSET); 
(* Builds a BITSET marking each register used in a MOVEM instruction *) 


TYPE 
MReg = (D0, D1, D2, D3, D4, D5, D6, D7,
A0, A1, A2, A3, A4, A5, A6, A7);


VAR 
i, j : CARDINAL; 
ch : CHAR; 
C : CARDINAL; (* ORD value of ch *) 
T1, T2 : MReg; (* Temporary variables for registers *)
RegStack : ARRAY [0..15] OF MReg; (* Holds specified registers *) 
SP : CARDINAL; (* Pointer for Register Stack *) 


RegType : (D, A, Nil); 
Range : BOOLEAN; 


BEGIN 
SP := 0; 
Range := FALSE; 
RegType := Nil; 
i := 0;

ch := Oper[i];
WHILE ch # 0C DO
IF SP > 15 THEN 
Error (Loc, SizeErr); 
RETURN; 
END; 


C := ORD (ch); 
IF ch = 'A' THEN 
IF RegType = Nil THEN 
RegType := A;
ELSE
Error (Loc, OperErr) ; 
RETURN; 
END; 
ELSIF ch = 'D' THEN 
IF RegType = Nil THEN 
RegType := D; 
ELSE 
Error (Loc, OperErr); 
RETURN; 
END; 
ELSIF (C >= Zero) AND (C <= Seven) THEN 
IF RegType # Nil THEN 
T2 := VAL (MReg, (ORD (RegType) * 8) + (C - Zero)); 
IF Range THEN 
Range := FALSE; 
T1 := RegStack[SP - 1]; (* retrieve 1st Reg in range *)
FOR j := (ORD (T1) + 1) TO ORD (T2) DO
RegStack[SP] := VAL (MReg, j); 
INC (SP); 
END; 
ELSE 
RegStack[SP] := T2; 
INC (SP); 
END; 
ELSE 
Error (Loc, OperErr) ; 
RETURN; 
END; 
ELSIF ch = '-' THEN 
IF (Range = FALSE) AND (RegType # Nil) AND (i > 0) THEN 
RegType := Nil; 
Range := TRUE; 
ELSE 
Error (Loc, OperErr) ; 
RETURN; 
END; 
ELSIF ch = '/' THEN 
IF (Range = FALSE) AND (RegType # Nil) AND (i > 0) THEN 
RegType := Nil; 
ELSE 
Error (Loc, OperErr) ; 
RETURN; 
END; 
ELSE 
Error (Loc, OperErr) ; 
RETURN; 
END; 


INC (i); 
ch := Oper[i]; 
END;

MultExt := {};
FOR j := 0 TO SP - 1 DO
C := ORD (RegStack[j]);
IF PreDec THEN
C := 15 - C;
END; 
INCL (MultExt, C); 
END; 


END GetMultReg; 


END SyntaxAnalyzer. 


Listing 10.9 


IMPLEMENTATION MODULE CodeGenerator; 
(* Uses information supplied by Parser, OperationCodes, *) 
(* and SyntaxAnalyzer to produce the object code. *) 


FROM Strings IMPORT 
Length, CompareStr; 


FROM SymbolTable IMPORT 
FillSymTab, ReadSymTab; 


FROM Parser IMPORT 
TOKEN, OPERAND, OpLoc, SrcLoc, DestLoc; 


FROM LongNumbers IMPORT 
LONG, LongAdd, LongSub, LongInc, LongDec, 
LongClear, CardToLong, LongToCard, LongToInt, 
LongCompare, AddrBoundW, AddrBoundL; 


FROM OperationCodes IMPORT 
ModeTypeA, ModeTypeB, ModeA, ModeB, Instructions; 


FROM ErrorX68 IMPORT 
ErrorType, Error; 


FROM SyntaxAnalyzer IMPORT 
SizeType, OpConfig, OpMode, Xtype, 
GetValue, GetSize, GetInstModeSize, GetOperand, GetMultReg; 


CONST 
JMP = {14, 11, 10, 9, 7, 6};
JSR = {14, 11, 10, 9, 7};
RTE = {14, 11, 10, 9, 6, 5, 4, 1, 0};
RTR = {14, 11, 10, 9, 6, 5, 4, 2, 1, 0};
RTS = {14, 11, 10, 9, 6, 5, 4, 2, 0};
TRAPV = {14, 11, 10, 9, 6, 5, 4, 2, 1};
STOP = {14, 11, 10, 9, 6, 5, 4, 1};
LINK = {14, 11, 10, 9, 6, 4};
SWAP = {14, 11, 6};
UNLK = {14, 11, 10, 9, 6, 4, 3};

Quote = 47C;


VAR
(*---
(* Defined in DEFINITION MODULE *)
LZero, AddrCnt : LONG;
Pass2 : BOOLEAN;
---*)
AddrAdv : LONG; 
TempL : LONG; (* Temporary variables *) 
TempI : INTEGER; 
TempC : CARDINAL; 


BrValue : LONG; (* Used to calculate relative branches *) 

RevBr : BOOLEAN; 

Size : SizeType; (* size for OpCode *) 

InstSize : CARDINAL; 

AddrModeA : ModeA; (* Addressing modes for this instruction *)
AddrModeB : ModeB; (* ditto *) 
Op : BITSET; (* Raw bit pattern for OpCode *) 


Src, Dest : OpConfig; 


PROCEDURE BuildSymTable (VAR AddrCnt : LONG; 
Label, OpCode : TOKEN; SrcOp, DestOp : OPERAND) ; 
(* Builds symbol table from symbolic information of Source File *) 


VAR 
Value : LONG; 
Full : BOOLEAN; 
PseudoOp : BOOLEAN; 


BEGIN 
Value := LZero; 
AddrAdv := LZero; 
InstSize := 0; 
PseudoOp := FALSE; 
Size := S0;
IF Length (OpCode) = 0 THEN 
RETURN; (* Nothing added to symbol table, AddrCnt not changed *) 
END; 


GetSize (OpCode, Size); 


IF CompareStr (OpCode, "ORG") = 0 THEN 
GetValue (SrcOp, AddrCnt); 
AddrBoundW (AddrCnt) ; 
Value := AddrCnt; 
PseudoOp := TRUE; 

ELSIF CompareStr (OpCode, "EQU") = 0 THEN 
GetValue (SrcOp, Value); 


PseudoOp := TRUE; 
ELSIF CompareStr (OpCode, "DC") = 0 THEN 
CASE Size OF 
Word : AddrBoundW (AddrCnt); 
| Long : AddrBoundL (AddrCnt);
| Byte : ;
END;
IF SrcOp[0] = Quote THEN (* String Constant *)
TempC := Length (SrcOp); 
IF TempC > 2 THEN 
InstSize := TempC - 2; 
END; 
ELSE 
InstSize := ORD (Size); 
END; 
CardToLong (InstSize, AddrAdv); 
Value := AddrCnt; 
PseudoOp := TRUE; 
ELSIF CompareStr (OpCode, "DS") = 0 THEN 
GetValue (SrcOp, AddrAdv); 
Value := AddrCnt; 
PseudoOp := TRUE; 
ELSIF CompareStr (OpCode, "EVEN") = 0 THEN
AddrBoundW (AddrCnt); 
Value := AddrCnt; 
PseudoOp := TRUE; 
ELSIF CompareStr (OpCode, "END") = 0 THEN 
PseudoOp := TRUE; 
ELSE 
Value := AddrCnt; 
END; 


IF Length (Label) # 0 THEN 
FillSymTab (Label, Value, Full); 
IF Full THEN 
Error (0, SymFull); 
END; 
END; 


IF NOT PseudoOp THEN 
Instructions (OpCode, OpLoc, Op, AddrModeA, AddrModeB) ; 


AddrBoundW (AddrCnt); 

Src.Loc := SrcLoc; Dest.Loc := DestLoc; 
GetOperand (SrcOp, Src); 

GetOperand (DestOp, Dest); 

InstSize := 2; (* minimum size of instruction *) 


IF Brnch IN AddrModeA THEN 
IF Size # Byte THEN 
INC (InstSize, 2); 
END; 
ELSIF DecBr IN AddrModeA THEN 
INC (InstSize, 2); 
ELSE 
IF (Op = JMP) OR (Op = JSR) THEN (* Allows for 'JMP.S' *) 
IF (Size = Byte) AND (Src.Mode = AbsL) THEN 


Src.Mode := AbsW; 
END; 
END; 
TempC := GetInstModeSize (Src.Mode, Size, InstSize) ; 
TempC := GetInstModeSize (Dest.Mode, Size, InstSize);
IF (Src.Mode = Imm) AND
((Data911 IN AddrModeA) OR (Data03 IN AddrModeA) OR
(Data07 IN AddrModeA) OR (CntR911 IN AddrModeA)) THEN
(* Quick instruction *)
InstSize := 2;
END;
END;
CardToLong (InstSize, AddrAdv);
END;
END BuildSymTable; 


PROCEDURE MergeModes (VAR SrcOp, DestOp : OPERAND;
VAR ObjOp, ObjSrc, ObjDest : LONG;
VAR nO, nS, nD : CARDINAL);
(* Uses information from Instructions & GetOperand (among others) *)
(* to complete calculation of Object Code. *)
(* Op, AddrModeA, AddrModeB, Size, and Src & Dest records are all *)
(* Global variables imported from the SyntaxAnalyzer MODULE. *)

CONST
(* BITSETs of the modes MISSING from effective address modes *)
ea = {}; (* Effective addressing - all modes *)
dea = {1}; (* Data effective addressing *)
mea = {1, 0}; (* Memory effective addressing *)
cea = {11, 4, 3, 1, 0}; (* Control effective addressing *)
aea = {11, 10, 9}; (* Alterable effective addressing *)
xxx = {15, 14, 13}; (* extra modes: CCR/SR/USP *)
(* 2 "AND" masks to turn off switch bits for shift/rotate *)
Off910 = {15, 14, 13, 12, 11, 8, 7, 6, 5, 4, 3, 2, 1, 0};
Off34 = {15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 2, 1, 0};

VAR
M : CARDINAL;
i : CARDINAL;
Ext : BITSET;
ExtL : LONG;
Quick : BOOLEAN;


PROCEDURE OperExt (VAR EA : OpConfig);
(* Calculate Operand Extension word, and check range of Operands *)

VAR
GoodInt : BOOLEAN;
Xext : BITSET;

BEGIN
GoodInt := LongToInt (EA.Value, TempI);

CASE EA.Mode OF
AbsL : ; (* No range checking needed *)
| AbsW : IF NOT GoodInt THEN
Error (EA.Loc, SizeErr);
END;
| ARDisp,
PCDisp : IF NOT GoodInt THEN
Error (EA.Loc, SizeErr);
END;
| ARDisX,
PCDisX : IF (TempI < -128) OR (TempI > 127) THEN 
Error (EA.Loc, SizeErr); 
END; 
Xext := BITSET (EA.Xn * 4096); 
IF EA.X = Areg THEN 
Xext := Xext + {15}; 
END; 
IF EA.Xsize = Long THEN 
Xext := Xext + {11}; 
END; 
CardToLong (CARDINAL (Xext), TempL); 
EA.Value[3] := TempL[3]; 
EA.Value[4] := TempL[4]; 
| Imm : IF Size = Long THEN
(* No range check needed *) 
ELSE 
IF GoodInt THEN 
IF Size = Byte THEN 
IF (TempI < -128) OR (TempI > 127) THEN 
Error (EA.Loc, SizeErr); 
END; 
END; 
ELSE 
Error (EA.Loc, SizeErr); 
END; 
END; 
ELSE 
(* No Action *) 
END; 


END OperExt; 


PROCEDURE EffAdr (VAR EA : OpConfig; Bad : BITSET); 
(* adds effective address field to Op (BITSET representing opcode) *) 


BEGIN 
M := ORD (EA.Mode) ; 


IF M IN Bad THEN 
Error (EA.Loc, ModeErr); 
RETURN; 
ELSIF M > 11 THEN 
RETURN; 
ELSIF M < 7 THEN 
Op := Op + BITSET (M * 8) + BITSET (EA.Rn); 
ELSE (* 7 <= M <= 11 *) 
Op := Op + {5, 4, 3} + BITSET (M - 7); 
END; 


OperExt (EA); 


END EffAdr; 
BEGIN (* MergeModes *) 
ExtL := LZero; 


Quick := FALSE; 


(* Check for 5 special cases first *)
IF (Op = RTE) OR (Op = RTR) OR (Op = RTS) OR (Op = TRAPV) THEN
IF Src.Mode # Null THEN
Error (SrcLoc, OperErr) ; 
END; 
END; 


IF Op = STOP THEN 
IF (Src.Mode # Imm) OR (Dest.Mode # Null) THEN 
Error (SrcLoc, OperErr); 
END; 
END; 


IF Op = LINK THEN 
Op := Op + BITSET (Src.Rn); 
IF (Src.Mode # ARDir) OR (Dest.Mode # Imm) THEN 
Error (SrcLoc, ModeErr) ; 
END; 
END; 


IF Op = SWAP THEN 
IF EA05f IN AddrModeB THEN
(* Ignore, this is PEA instruction! *)
ELSE
Op := Op + BITSET (Src.Rn);
IF (Src.Mode # DReg) OR (Dest.Mode # Null) THEN 
Error (SrcLoc, OperErr); 
END; 
END; 
END; 


IF Op = UNLK THEN 
Op := Op + BITSET (Src.Rn); 
IF (Src.Mode # ARDir) OR (Dest.Mode # Null) THEN 
Error (SrcLoc, OperErr); 
END; 
END; 


(* Now do generalized address modes *) 


IF (Ry02 IN AddrModeA) AND (Rx911 IN AddrModeA) THEN 
Op := Op + BITSET (Src.Rn) + BITSET (Dest.Rn * 512); 
(* Now do some error checking! *) 
IF RegMem3 IN AddrModeA THEN 

IF Src.Mode = DReg THEN 
IF Dest.Mode # DReg THEN
Error (DestLoc, ModeErr); 
END; 
ELSIF Src.Mode = ARPre THEN 
Op := Op + {3}; 
IF Dest.Mode # ARPre THEN 
Error (DestLoc, ModeErr) ; 
END; 
ELSE 
Error (SrcLoc, OperErr); 
END; 
ELSE 
IF Src.Mode = ARPost THEN 
IF Dest.Mode # ARPost THEN 
Error (DestLoc, ModeErr); 
END;
ELSE
Error (SrcLoc, OperErr); 
END; 
END; 
END; 


IF Data911 IN AddrModeA THEN 
Quick := TRUE; 
IF Src.Mode = Imm THEN 
IF LongToInt (Src.Value, TempI)
AND (TempI > 0) 
AND (TempI <= 8) THEN 
IF TempI < 8 THEN (* Data of 8 is coded as 000 *) 
Op := Op + BITSET (TempI * 512); 
END; 
ELSE 
Error (SrcLoc, SizeErr); 
END; 
ELSE 
Error (SrcLoc, OperErr); 
END; 
END; 


IF CntR911 IN AddrModeA THEN 
(* Only Shift/Rotate use this *) 
IF Dest.Mode = DReg THEN
Op := (Op * Off910) + BITSET (Dest.Rn);
CASE Size OF
Byte : ;
| Word : Op := Op + {6};
| Long : Op := Op + {7};
END;
IF Src.Mode = DReg THEN
Op := Op + {5} + BITSET (Src.Rn * 512);
ELSIF Src.Mode = Imm THEN 
Quick := TRUE; 
(* Range Check *) 
IF LongToInt (Src.Value, TempI) 
AND (TempI > 0) 
AND (TempI <= 8) THEN 
IF TempI < 8 THEN (* Data of 8 is coded as 000 *) 
Op := Op + BITSET (TempI * 512); 


END; 
ELSE 
Error (SrcLoc, SizeErr); 
END; 
ELSE 
Error (SrcLoc, OperErr); 
END; 
ELSIF Dest.Mode = Null THEN 
Op := (Op * Off34) + {7, 6}; 
EffAdr (Src, (mea + aea)); 
ELSE 
Error (SrcLoc, OperErr); 
END; 


END; 


IF Data03 IN AddrModeA THEN 
Quick := TRUE; 
IF Src.Mode = Imm THEN 
IF LongToInt (Src.Value, TempI)
AND (TempI >= 0)
AND (TempI < 16) THEN 
Op := Op + BITSET (TempI); 


ELSE 
Error (SrcLoc, SizeErr) ; 
END; 
ELSE 
Error (SrcLoc, OperErr) ; 
END; 
END; 


IF Data07 IN AddrModeA THEN
Quick := TRUE;
IF (Src.Mode = Imm) AND (Dest.Mode = DReg) THEN
IF LongToInt (Src.Value, TempI)
AND (TempI >= -128)
AND (TempI <= 127) THEN
Op := Op + (BITSET (TempI) * {7, 6, 5, 4, 3, 2, 1, 0})
+ BITSET (Dest.Rn * 512);
ELSE
Error (SrcLoc, SizeErr);
END;
ELSE
Error (SrcLoc, OperErr);
END;
END;


IF OpM68D IN AddrModeA THEN 

IF Dest.Mode = DReg THEN 
Op := Op + BITSET (Dest.Rn * 512); 
IF (Src.Mode = ARDir) AND (Size = Byte) THEN 

Error (SrcLoc, SizeErr); 

END; 

ELSE (* Assume Src.Mode = DReg -- Error trapped elsewhere *) 
Op := Op + BITSET (Src.Rn * 512); 
Op := Op + {8}; 

END; 


CASE Size OF 


Byte : ;
| Word : Op := Op + {6}; 
| Long : Op := Op + {7}; 
END; 


END; 


IF OpM68A IN AddrModeA THEN 
IF Dest.Mode = ARDir THEN
Op := Op + BITSET (Dest.Rn * 512); 
ELSE 
Error (DestLoc, ModeErr) ; 
END; 


CASE Size OF 
Byte : Error (OpLoc, SizeErr); 
| Word : Op := Op + {7, 6};
| Long : Op := Op + {8, 7, 6};
END;
END;


IF OpM68C IN AddrModeA THEN
IF Dest.Mode = DReg THEN 


Op := Op + BITSET (Dest.Rn * 512); 
ELSE 

Error (DestLoc, ModeErr) ; 
END; 


CASE Size OF 
Byte : IF Src.Mode = ARDir THEN 
Error (OpLoc, SizeErr); 


END; 
| Word : Op := Op + {6}; 
| Long : Op := Op + {7}; 
END; 
END; 


IF OpM68X IN AddrModeA THEN 
IF Src.Mode = DReg THEN 
Op := Op + BITSET (Src.Rn * 512); 
ELSE 
Error (SrcLoc, ModeErr) ; 
END; 


CASE Size OF 
Byte : Op := Op + {8}; 


| Word : Op := Op + {8, 6}; 
| Long : Op := Op + {8, 7}; 
END; 


END; 


IF OpM68S IN AddrModeA THEN 
IF Src.Mode = DReg THEN 
Op := Op + BITSET (Src.Rn); 
ELSE 
Error (SrcLoc, ModeErr) ; 
END; 


CASE Size OF 
Byte : Error (OpLoc, SizeErr); 


| Word : Op := Op + {7}; 
| Long : Op := Op + {7, 6};
END; 


END; 


IF OpM68R IN AddrModeA THEN 
IF (Src.Mode = DReg) AND (Dest.Mode = ARDisp) THEN 
CASE Size OF 
Byte : Error (OpLoc, SizeErr); 
| Word : Op := Op + {8, 7}; 
| Long : Op := Op + {8, 7, 6}; 
END; 
Op := Op + BITSET (Src.Rn * 512) + BITSET (Dest.Rn);
ELSIF (Src.Mode = ARDisp) AND (Dest.Mode = DReg) THEN 
CASE Size OF 
Byte : Error (OpLoc, SizeErr); 
| Word : Op := Op + {8}; 
| Long : Op := Op + {8, 6}; 
END; 
Op := Op + BITSET (Src.Rn) + BITSET (Dest.Rn * 512);
ELSE
Error (SrcLoc, ModeErr) ; 
END; 
END; 


IF OpM37 IN AddrModeA THEN 

IF (Src.Mode = DReg) AND (Dest.Mode = DReg) THEN 

Op := Op + {6} + BITSET (Src.Rn * 512) + BITSET (Dest.Rn); 
ELSIF (Src.Mode = ARDir) AND (Dest.Mode = ARDir) THEN 

Op := Op + {6, 3} + BITSET (Src.Rn * 512) + BITSET (Dest.Rn); 
ELSIF (Src.Mode = ARDir) AND (Dest.Mode = DReg) THEN 

Op := Op + {7, 3} + BITSET (Dest.Rn * 512) + BITSET (Src.Rn); 
ELSIF (Src.Mode = DReg) AND (Dest.Mode = ARDir) THEN 


Op := Op + {7, 3} + BITSET (Src.Rn * 512) + BITSET (Dest.Rn); 
ELSE 

Error (SrcLoc, ModeErr); 
END; 


END; 


IF Bit811 IN AddrModeB THEN 
IF Src.Mode = DReg THEN 
Op := Op + {8} + BITSET (Src.Rn * 512); 
ELSIF Src.Mode = Imm THEN 
Op := Op + {11}; 
ELSE 
Error (SrcLoc, ModeErr) ; 
END; 
END; 


IF Size67 IN AddrModeB THEN
CASE Size OF 
Byte : ;(* No action -- bits already 0's *) 
| Word : Op := Op + {6}; 
| Long : Op := Op + {7}; 
END; 
END; 


IF Size6 IN AddrModeB THEN
CASE Size OF 
Byte : Error (OpLoc, SizeErr); 


| Word : (* No Action -- BIT is already 0 *) 
| Long : Op := Op + {6}; 
END; 


END; 


IF Size1213A IN AddrModeB THEN
CASE Size OF 
Byte : Op := Op + {12}; 
| Word : Op := Op + {13, 12}; 
| Long : Op := Op + {13}; 
END; 
END; 


IF Size1213 IN AddrModeB THEN
Op := Op + BITSET (Dest.Rn * 512); 
CASE Size OF 
Byte : Error (OpLoc, SizeErr); 
| Word : Op := Op + {13, 12}; 
| Long : Op := Op + {13}; 
END; 
END;


IF EA05a IN AddrModeB THEN
IF (Dest.Mode = DReg) OR (Dest.Mode = ARDir) THEN 
EffAdr (Src, ea); 
ELSE 
Error (DestLoc, ModeErr) ; 
END; 
END; 


IF EA05b IN AddrModeB THEN
IF Dest.Mode = DReg THEN 
EffAdr (Src, dea); 


Op := Op + BITSET (Dest.Rn * 512); 
ELSE 

Error (DestLoc, ModeErr); 
END; 


END; 


IF EA05c IN AddrModeB THEN
EffAdr (Dest, {11, 1}); 
END; 


IF EA05d IN AddrModeB THEN
EffAdr (Dest, aea); 
IF (Dest.Mode = ARDir) AND (Size = Byte) THEN 
Error (OpLoc, SizeErr); 
END; 
END; 


IF EA05e IN AddrModeB THEN
IF Dest.Mode = Null THEN
EffAdr (Src, (dea + aea)); 
ELSIF (Src.Mode = Imm) OR (Src.Mode = DReg) THEN 
EffAdr (Dest, (dea + aea)); 


ELSE 
Error (SrcLoc, ModeErr); 
END; 
END; 
IF EA05f IN AddrModeB THEN (* LEA & PEA / JMP & JSR *)


EffAdr (Src, cea); 
IF Rx911 IN AddrModeA THEN 
IF Dest.Mode = ARDir THEN 
Op := Op + BITSET (Dest.Rn * 512); 
ELSE 
Error (DestLoc, ModeErr); 
END; 
ELSE 
IF Dest.Mode # Null THEN 
Error (DestLoc, OperErr); 
END; 
END; 
END; 


IF EA05x IN AddrModeB THEN
IF Dest.Mode = DReg THEN
EffAdr (Src, dea); 
ELSIF Src.Mode = DReg THEN 
EffAdr (Dest, mea + aea);
ELSE
Error (SrcLoc, OperErr) ; 
END; 
END; 


IF EA05y IN AddrModeB THEN
IF Dest.Mode = DReg THEN 
EffAdr (Src, ea); 
IF (Src.Mode = ARDir) AND (Size = Byte) THEN 
Error (OpLoc, SizeErr); 
END; 
ELSIF Src.Mode = DReg THEN 
EffAdr (Dest, (mea + aea)); 
ELSE 
Error (SrcLoc, ModeErr) ; 
END; 
END; 


IF EA05z IN AddrModeB THEN
IF Src.Mode = MultiM THEN
EffAdr (Dest, (mea + aea + {3}));
GetMultReg (SrcOp, (Dest.Mode = ARPre), SrcLoc, Ext);
ELSIF Dest.Mode = MultiM THEN
EffAdr (Src, (mea + {11, 4}));
GetMultReg (DestOp, (Src.Mode = ARPre), DestLoc, Ext);
Op := Op + {10}; (* set direction *)
ELSE
Error (SrcLoc, OperErr);
END;

INC (nO, 4); (* extension is part of OpCode *)
INC (InstSize, 2);
CardToLong (CARDINAL (Ext), ExtL);
END;


IF EA611 IN AddrModeB THEN
IF Dest.Mode = CCR THEN
Op := {14, 10, 7, 6};
EffAdr (Src, dea);
ELSIF Dest.Mode = SR THEN
Op := {14, 10, 9, 7, 6};
EffAdr (Src, dea);
ELSIF Src.Mode = SR THEN
Op := {14, 7, 6};
EffAdr (Dest, dea + aea);
ELSIF Dest.Mode = USP THEN
Op := {14, 11, 10, 9, 6, 5};
IF Src.Mode = ARDir THEN
Op := Op + BITSET (Src.Rn);
ELSE
Error (SrcLoc, ModeErr);
END;
ELSIF Src.Mode = USP THEN
Op := {14, 11, 10, 9, 6, 5, 3};
IF Dest.Mode = ARDir THEN
Op := Op + BITSET (Dest.Rn);
ELSE
Error (DestLoc, ModeErr);
END;
ELSE
EffAdr (Src, (ea + xxx));
IF (Size = Byte) AND (Src.Mode = ARDir) THEN
Error (SrcLoc, SizeErr);
END;

M := ORD (Dest.Mode);
IF (M IN (dea + aea)) OR (M > 11) THEN
Error (DestLoc, ModeErr);
ELSIF M < 7 THEN
Op := Op + BITSET (M * 64) + BITSET (Dest.Rn * 512);
ELSE (* 7 <= M <= 11 *)
Op := Op + {8, 7, 6} + BITSET ((M - 7) * 512);
END;

OperExt (Dest);
END;
END;


IF (Dest.Mode = CCR) AND (Src.Mode = Imm) THEN
IF (Size67 IN AddrModeB)
AND (EA05e IN AddrModeB)
AND (Exten IN AddrModeB) THEN
IF 10 IN Op THEN (* NOT ANDI/EORI/ORI *)
Error (DestLoc, ModeErr);
ELSE
Op := Op * {15, 14, 13, 12, 11, 10, 9, 8}; (* AND mask *)
Op := Op + {5, 4, 3, 2}; (* OR mask *)
END;
END;
END;


IF (Dest.Mode = SR) AND (Src.Mode = Imm) THEN
IF (Size67 IN AddrModeB)
AND (EA05e IN AddrModeB)
AND (Exten IN AddrModeB) THEN
IF 10 IN Op THEN (* NOT ANDI/EORI/ORI *)
Error (DestLoc, ModeErr);
ELSE
Op := Op * {15, 14, 13, 12, 11, 10, 9, 8}; (* AND mask *)
Op := Op + {6, 5, 4, 3, 2}; (* OR mask *)
END;
END;
END;


CardToLong (CARDINAL (Op), ObjOp);
INC (InstSize, 2);
INC (nO, 4);
IF nO > 4 THEN
FOR i := 1 TO 4 DO (* move ObjOp -- make room for extension *)
ObjOp[i + 4] := ObjOp[i];
ObjOp[i] := ExtL[i];
END;
END;


nS := GetInstModeSize (Src.Mode, Size, InstSize);
ObjSrc := Src.Value;
nD := GetInstModeSize (Dest.Mode, Size, InstSize);
ObjDest := Dest.Value;


IF Quick THEN


InstSize := 2; 
nS := 0; nD := 0; 
END; 


CardToLong (InstSize, AddrAdv) ; 
END MergeModes; 


TYPE 
DirType = (None, Org, Equ, DC, DS, Even, End); 


PROCEDURE ObjDir (OpCode : TOKEN; SrcOp : OPERAND; Size : SizeType; 

VAR AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 

VAR nA, nO, nS, nD : CARDINAL) : DirType;
(* Generates Object Code for Assembler Directives *) 


VAR 
Dir : DirType; 
i, j : CARDINAL; 
LongString : ARRAY [1..20] OF INTEGER; 


BEGIN 
AddrAdv := LZero; 


IF CompareStr (OpCode, "ORG") = 0 THEN 
GetValue (SrcOp, AddrCnt); 
AddrBoundW (AddrCnt) ; 
Dir := Org; 

ELSIF CompareStr (OpCode, "EQU") = 0 THEN 
GetValue (SrcOp, ObjSrc); 


nS := 8; 
Dir := Equ; 
ELSIF CompareStr (OpCode, "DC") = 0 THEN 
CASE Size OF 
Word : AddrBoundW (AddrCnt) ; 
| Long : AddrBoundL (AddrCnt); 
| Byte : ;
END; 
IF SrcOp[0] = Quote THEN (* String constant *) 

TempC := Length (SrcOp); 

IF TempC > 2 THEN 
InstSize := TempC - 2; (* Don't count the Quotes *) 

END; 

i := 1; j := 20;
WHILE i <= InstSize DO (* Change from ASCII to LONG *)
CardToLong (ORD (SrcOp[i]), TempL);
LongString[j] := TempL[2];
LongString[j - 1] := TempL[1];
INC (i); DEC (j, 2);
END;

i := 1; INC (j);
WHILE j <= 20 DO (* Left Justify String *)
LongString[i] := LongString[j];
INC (i); INC (j);
END;

DEC (i);
WHILE i > 16 DO (* Transfer 2 bytes to OpCode *)
ObjOp[i - 16] := LongString[i];
INC (nO); DEC (i); 

END; 


WHILE i > 8 DO (* Transfer 4 bytes to Source Operand *) 
ObjSrc[i - 8] := LongString[i]; 
INC (nS); DEC (i); 

END; 


WHILE i > 0 DO (* Transfer 4 bytes to Destination Operand *) 


ObjDest[i] := LongString[i];
INC (nD); DEC (i); 
END; 


IF SrcOp[InstSize + 1] # Quote THEN 
Error ((SrcLoc + InstSize + 1), OperErr); 
END; 
ELSE (* not a string constant *) 
GetValue (SrcOp, ObjSrc); 


InstSize := ORD (Size); 
nS := InstSize * 2; 
END; 
CardToLong (InstSize, AddrAdv) ; 
nA := 6; 
Dir := DC; 
ELSIF CompareStr (OpCode, "DS") = 0 THEN 
GetValue (SrcOp, AddrAdv) ; 
nA := 6; nS := 2; ObjSrc := LZero;
Dir := DS;
ELSIF CompareStr (OpCode, "EVEN") = 0 THEN 
AddrBoundW (AddrCnt) ; 
Dir := Even; 
ELSIF CompareStr (OpCode, "END") = 0 THEN 
nA := 6; 
Dir := End; 
ELSE 
Dir := None; 
END; 


RETURN (Dir); 
END ObjDir; 


PROCEDURE AdvAddrCnt (VAR AddrCnt : LONG); 
(* Advances the address counter based on the length of the instruction *) 
BEGIN 
LongAdd (AddrCnt, AddrAdv, AddrCnt) ; 
END AdvAddrCnt; 


PROCEDURE GetObjectCode (Label, OpCode : TOKEN; 

SrcOp, DestOp : OPERAND; 

VAR AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 

VAR nA, nO, nS, nD : CARDINAL);
(* Determines the object code for the operation as well as the operands *)
(* Returns each (up to 3 fields), along with the length of each. *)

VAR
Dummy : BOOLEAN; 
Dir : DirType; 


BEGIN 
AddrAdv := LZero; 
InstSize := 0; 
nA := 0; nO := 0; nS := 0; nD := 0; 
IF Length (OpCode) = 0 THEN 
(* ensure no code generated *) 
RETURN; 
END; 


GetSize (OpCode, Size); 


Dir := ObjDir (OpCode, SrcOp, Size, 
AddrCnt, ObjOp, ObjSrc, ObjDest, 
nA, nO, nS, nD);


IF (Length (Label) # 0) AND (Dir # Equ) THEN 
(* Check for phase error *) 
Dummy := ReadSymTab (Label, TempL, Dummy) ; 
IF LongCompare (TempL, AddrCnt) # 0 THEN 
Error (0, Phase); 
END; 
END; 


IF Dir = None THEN (* Instruction *) 
AddrBoundW (AddrCnt) ; 

ELSE 
RETURN; 

END; 


Instructions (OpCode, OpLoc, Op, AddrModeA, AddrModeB) ; 
Src.Loc := SrcLoc; Dest.Loc := DestLoc;

GetOperand (SrcOp, Src); (* Src & Dest are RECORDS *) 
GetOperand (DestOp, Dest); 


IF DecBr IN AddrModeA THEN (* Decrement & Branch *) 
IF Src.Mode # DReg THEN 
Error (SrcLoc, ModeErr); 


END; 

BrValue := Dest.Value; 

TempL := AddrCnt; 

TempC := 32767; (* Maximum Branch *) 

LongInc (TempL, 2); (* move past instruction for Rel Adr Calc *) 


IF LongCompare (BrValue, TempL) < 0 THEN 


RevBr := TRUE; 

LongSub (TempL, BrValue, BrValue); 

INC (TempC) ; (* can branch 1 farther in reverse *) 
ELSE 

RevBr := FALSE; 

LongSub (BrValue, TempL, BrValue); 
END; 


CardToLong (TempC, TempL); (* Maximum Branch distance *)
IF LongCompare (BrValue, TempL) > 0 THEN
Error (DestLoc, BraErr); 

END; 

IF RevBr THEN (* Make Negative *) 
LongSub (LZero, BrValue, BrValue) 

END; 


CardToLong (4, AddrAdv); 


nA := 6; nO := 4; nS := 4; 
CardToLong (CARDINAL (Op + BITSET (Src.Rn)), ObjOp); 
ObjSrc := BrValue;
RETURN; 
END; 


IF Brnch IN AddrModeA THEN (* Branch *) 
BrValue := Src.Value; (* Destination of Branch *) 
TempL := AddrCnt; 
LongInc (TempL, 2); 


IF Size # Byte THEN (* Byte Size ---> Short Branch *) 
TempC := 32767; (* Set maximum branch distance *) 
ELSE 
TempC := 127; 
END; 


CASE LongCompare (BrValue, TempL) OF
-1 : (* Reverse Branch *)
RevBr := TRUE;
INC (TempC); (* can branch 1 farther in reverse *)
LongSub (TempL, BrValue, BrValue);
| +1 : (* Forward Branch *)

RevBr := FALSE; 

LongSub (BrValue, TempL, BrValue); 
| 0 : IF Size = Byte THEN 

Error (SrcLoc, BraErr); 

END; 

END; 


CardToLong (TempC, TempL) ; 


IF LongCompare (BrValue, TempL) > 0 THEN 
Error (SrcLoc, BraErr); 
END; 


IF RevBr THEN 
LongSub (LZero, BrValue, BrValue); (* Make negative *) 


END; 


IF Size # Byte THEN 


InstSize := 4; 
nS := 4; 
ObjSrc := BrValue;
ELSE
InstSize := 2;
Dummy := LongToInt (BrValue, TempI);
Op := Op + (BITSET (TempI) * {7, 6, 5, 4, 3, 2, 1, 0});
END;

nA := 6; nO := 4;
CardToLong (InstSize, AddrAdv) ; 
CardToLong (CARDINAL (Op), ObjOp); 


RETURN; 
END; 
nA := 6; 
IF (Op = JMP) OR (Op = JSR) THEN (* Allows for 'JMP.S' *) 


IF (Size = Byte) AND (Src.Mode = AbsL) THEN 
Src.Mode := AbsW;
END; 
END; 
MergeModes (SrcOp, DestOp, ObjOp, ObjSrc, ObjDest, nO, nS, nD); 
END GetObjectCode; 


BEGIN (* MODULE Initialization *) 
LongClear (LZero); (* Used as a constant *) 
AddrCnt := LZero; 


Pass2 := FALSE; 
END CodeGenerator. 


Listing 10.10 


IMPLEMENTATION MODULE Listing; 
(* Creates a program listing, including Addresses, Code & Source. *) 


FROM Files IMPORT 
FILE, Write; 


FROM LongNumbers IMPORT 
LONG, LongPut; 


FROM Parser IMPORT 
TOKEN, Line; 


FROM SymbolTable IMPORT 
ListSymTab; 


FROM Conversions IMPORT 
CardToStr; 


FROM Strings IMPORT 
Length; 


IMPORT ASCII; 


CONST 
LnMAX = 55; 
VAR 
LnCnt : CARDINAL; (* counts number of lines per page *) 


PgCnt : CARDINAL; (* count of page numbers *) 
PROCEDURE WriteStrF (f : FILE; Str : ARRAY OF CHAR); 
(* Writes a string to the file *) 


VAR 
i : CARDINAL; 


BEGIN 
i := 0; 
WHILE Str[i] # 0C DO 
Write (f, Str[i]); 
INC (i); 
END; 
END WriteStrF; 


PROCEDURE CheckPage (f : FILE); 
(* Checks if end of page reached yet -- if so, advances to next page. *) 


VAR 
i : CARDINAL; 
PgCntStr : ARRAY [0..6] OF CHAR; 


BEGIN 
INC (LnCnt); 

IF LnCnt >= LnMAX THEN 
LnCnt := 1; 
INC (PgCnt); 
Write (f, ASCII.ff); (* Form Feed for new page *) 
IF CardToStr (PgCnt, PgCntStr) THEN (* Print New Page Number *) 
FOR i := 1 TO 60 DO 
Write (f, ' '); 
END; 
WriteStrF (f, "Page "); 
WriteStrF (f, PgCntStr); 
END; 
FOR i := 1 TO 3 DO 
Write (f, ASCII.cr); 
Write (f, ASCII.lf); 
END; 
END; 
END CheckPage; 


PROCEDURE StartListing (f : FILE); 
(* Sign on messages for listing file -- initialize *) 


BEGIN 
Write (f, ASCII.ff); (* Start on a clean page *) 


WriteStrF (f, " 68000 Cross Assembler"); 
Write (f, ASCII.cr); 
Write (f, ASCII.lf); 


WriteStrF (f, " Copyright (c) 1985 by Brian R. Anderson"); 
Write (f, ASCII.cr); 
Write (f, ASCII.lf); 


Write (f, ASCII.cr); 
Write (f, ASCII.lf); 


LnCnt := 1; 
PgCnt := 1; 
END StartListing; 


PROCEDURE WriteListLine (f : FILE; 
AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 
nA, nO, nS, nD : CARDINAL); 
(* Writes one line to the Listing file, Including Object Code *) 


CONST 
ObjMAX = 30; 


VAR 
i: CARDINAL; 


BEGIN 

IF nA = 0 THEN (* nA is always either 0 or 6. Address field = 8 *) 
FOR i := 1 TO 8 DO 
Write (f, ' '); 
END; 
ELSE 
LongPut (f, AddrCnt, 6); 
Write (f, ' '); 
Write (f, ' '); 
END; 


LongPut (f, ObjOp, nO); 
LongPut (f, ObjSrc, nS); 
LongPut (f, ObjDest, nD); 
i := 8 + nO + nS + nD; 
WHILE i < ObjMAX DO 
Write (f, ' '); 
INC (i); 
END; 


WriteStrF (f, Line); 
Write (f, ASCII.cr); 
Write (f, ASCII.lf); 


CheckPage (f); 


END WriteListLine; 


PROCEDURE WriteSymTab (f : FILE; NumSym : CARDINAL); 
(* Lists symbol table in alphabetical order *) 


VAR 
Label : TOKEN; 
Value : LONG; 
i, j, pos : CARDINAL; 


BEGIN 
LnCnt := 1; 
INC (PgCnt); 
WriteStrF (f, " * * * Symbolic Reference Table *"); 


FOR i := 1 TO 3 DO 
Write (f, ASCII.cr); 
Write (f, ASCII.lf); 
END; 


pos := 1; 

FOR i := 1 TO NumSym DO 
ListSymTab (i, Label, Value); 
WriteStrF (f, Label); 

FOR j := Length (Label) TO 8 DO 
Write (f, ' '); 
END; 

WriteStrF (f, ": "); 
LongPut (f, Value, 8); 
INC (pos); 

IF pos > 3 THEN 
pos := 1; 
Write (f, ASCII.cr); 
Write (f, ASCII.lf); 
CheckPage (f); 
ELSE 
WriteStrF (f, " | "); 
END; 
END; 


Write (f, ASCII.cr); 
Write (f, ASCII.lf); 
Write (f, ASCII.ff); 
END WriteSymTab; 


END Listing. 


Listing 10.11 


IMPLEMENTATION MODULE Srecord; 
(* Creates Motorola S-records of program: *) 
(* S0 = header record, *) 
(* S2 = code/data records (24 bit address), *) 
(* S8 = termination record (24 bit address). *) 


FROM Files IMPORT 
FILE, Write; 


FROM Strings IMPORT 
Length; 


FROM LongNumbers IMPORT 
LONG, LongAdd, LongSub, LongInc, LongDec, LongClear, 
LongCompare, CardToLong, LongPut; 


IMPORT ASCII; 


CONST 
CountMAX = 16; 
SrecMAX = CountMAX * 2; 
XrecMAX = SrecMAX; 


VAR 
StartAddr : LONG; (* address that record starts on *) 
TempAddr : LONG; (* running address of where we are now *) 
CheckSum : LONG; 
Count : CARDINAL; (* count of HEX-pairs in S-record *) 
Sdata : ARRAY [1..SrecMAX] OF INTEGER; (* S-record data, HEX digits *) 
Sindex : CARDINAL; (* index for Sdata array *) 
Xdata : ARRAY [1..XrecMAX] OF INTEGER; (* Overflow for Sdata *) 
Xindex : CARDINAL; (* index for Xdata array *) 
Boundary : BOOLEAN; (* marks Address MOD 16 boundary of S-record *) 
LZero : LONG; (* used as a constant = 0 *) 
PROCEDURE Complement; (* CheckSum *) 
BEGIN 
LongSub (LZero, CheckSum, CheckSum) ; (* 2's Complement *) 
LongDec (CheckSum, 1); (* Make it 1's Complement *) 


END Complement; 


PROCEDURE AppendSdata (Data : LONG; n : CARDINAL) : BOOLEAN; 
(* Transfers data to Sdata, and updates Count & CheckSum. *) 


(* If no room: Data goes to Xdata & FALSE returned. *) 
VAR 
T : LONG; (* temporary -- used only as a 2 digit HEX number *) 
BEGIN 
T := LZero; 


WHILE (n # 0) AND (Count # CountMAX) AND (NOT Boundary) DO 


Sdata[Sindex] := Data[n]; 
Sdata[Sindex - 1] := Data[n - 1]; 
T[2] := Data[n]; T[1] := Data[n - 1]; 


LongAdd (T, CheckSum, CheckSum) ; 
DEC (n, 2); 

DEC (Sindex, 2); 

INC (Count); 


LongInc (TempAddr, 1); 


IF TempAddr[1] = 0 THEN (* i.e., TempAddr MOD 16 = 0 *) 
Boundary := TRUE; 
END; 
END; 


IF (Count = CountMAX) OR (Boundary) THEN 
WHILE n > 0 DO (* Add Data to Xdata (in reverse) *) 
INC (Xindex) ; 
Xdata[Xindex] := Data[n]; 
DEC (n); 
END; 


RETURN FALSE; (* Sdata is full *) 
ELSE 
RETURN TRUE; 
END; 
END AppendSdata; 
PROCEDURE DumpSdata (f : FILE); 
(* Writes an S2 record to the file *) 


VAR 


T : LONG; (* temporary -- used to output Count & CheckSum *) 
i, j : CARDINAL; 


BEGIN 
IF Count = 0 THEN 
RETURN; (* nothing to dump *) 
END; 


Write (f, 'S'); 
Write (f, '2'); 


CardToLong (Count + 4, T); (* extra for Address & Checksum *) 
LongPut (f, T, 2); 
LongAdd (T, CheckSum, CheckSum) ; (* Add Count to CheckSum *) 


LongPut (f, StartAddr, 6); 
(* Add Address to CheckSum *) 


T := LZero; 

T[1] := StartAddr[1]; T[2] := StartAddr[2]; 
LongAdd (T, CheckSum, CheckSum); 

T[1] := StartAddr[3]; T[2] := StartAddr[4]; 
LongAdd (T, CheckSum, CheckSum); 

T[1] := StartAddr[5]; T[2] := StartAddr[6]; 


LongAdd (T, CheckSum, CheckSum) ; 


IF Count < CountMAX THEN (* adjust short record -- shuffle down *) 
j :=1; 
FOR i := Sindex + 1 TO SrecMAX DO 
Sdata[j] := Sdata[i]; 
INC (j); 
END; 
END; 
LongPut (f, Sdata, Count * 2); (* S-record Code/Data *) 


Complement; (* CheckSum *) 
LongPut (f, CheckSum, 2); 


Write (f, ASCII.cr); 
Write (f, ASCII.lf); 


LongInc (StartAddr, Count); 


Sindex := SrecMAX; 
Count := 0; 

Boundary := FALSE; 
CheckSum := LZero; 


END DumpSdata; 


PROCEDURE GetXdata; 
(* Transfer Xdata into new Sdata line -- N.B.: Xdata stored in reverse *) 


VAR 
i: CARDINAL; 
T : LONG; 
BEGIN 
i := 1; 
T := LZero; 
(* No need for either of the tests (CountMAX or Boundary) *) 
(* used in AppendSdata. GetXdata is only ever called *) 
(* after DumpSdata and is therefore only putting (up to 20) *) 
(* HEX digits in an empty buffer (which could hold 32). *) 


WHILE i < Xindex DO 
Sdata[Sindex] := Xdata[i]; 
Sdata[Sindex - 1] := Xdata[i + 1]; 


T[2] := Sdata[Sindex]; T[1] := Sdata[Sindex - 1]; 
LongAdd (T, CheckSum, CheckSum) ; 
INC (i, 2); 


DEC (Sindex, 2); 

INC (Count) ; 

LongInc (TempAddr, 1); 
END; 


Xindex := 0; 
END GetXdata; 


PROCEDURE StartSrec (f : FILE; SourceFN : ARRAY OF CHAR); 
(* Writes S0 record (HEADER) and initializes *) 


VAR 
T : LONG; (* temporary *) 
i : CARDINAL; 


BEGIN 
Write (f, 'S'); 
Write (f, '0'); 


CheckSum := LZero; 

Count := Length (SourceFN) + 3; (* extra for Address & Checksum *) 
CardToLong (Count, T); 

LongPut (f, T, 2); 

LongAdd (T, CheckSum, CheckSum) ; 


LongPut (f, LZero, 4); (* Address is 4 digit, all zero, for S0 *) 


i := 0; 

WHILE SourceFN[i] # 0C DO 
CardToLong (ORD (SourceFN[i]), T); 
LongAdd (T, CheckSum, CheckSum) ; 
LongPut (f, T, 2); 
INC (i); 

END; 


Complement; (* CheckSum *) 
LongPut (f, CheckSum, 2); 


Write (f, ASCII.cr); 
Write (f, ASCII.lf); 


Sindex := SrecMAX; 
Xindex := 0; 
Count := 0; 


Boundary := FALSE; 
CheckSum := LZero; 
StartAddr := LZero; 
TempAddr := LZero; 


END StartSrec; 


PROCEDURE WriteSrecLine (f : FILE; 
AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 


nA, nO, nS, nD : CARDINAL); 
(* Collects Object Code -- Writes an S2 record to file if line is full *) 
VAR 
dummy : BOOLEAN; 
BEGIN 
IF nA = 0 THEN 
RETURN; (* Nothing to add to S-record *) 
END; 
IF Xindex # 0 THEN 
GetXdata; (* transfers Xdata into Sdata *) 


END; 
IF LongCompare (AddrCnt, TempAddr) # 0 THEN 
DumpSdata (f); 


END; 


IF Count = 0 THEN 


StartAddr := AddrCnt; 
TempAddr := AddrCnt; 
END; 


dummy := AppendSdata (ObjOp, nO); 
dummy := AppendSdata (ObjSrc, nS); 
IF NOT AppendSdata (ObjDest, nD) THEN 
DumpSdata (f); 
END; 
END WriteSrecLine; 


PROCEDURE EndSrec (f : FILE); 
(* Finishes off any left-over (Partial) S2 line, *) 


(* and then writes S8 record (TRAILER) *) 
BEGIN 
IF Xindex # 0 THEN 
GetXdata; 
END; 


DumpSdata (f); 


Write (f, 'S'); (* Fixed format for S8 record *) 
Write (f, '8'); 
Write (f, '0'); 
Write (f, '4'); 
Write (f, '0'); 
Write (f, '0'); 
Write (f, '0'); 
Write (f, '0'); 
Write (f, '0'); 
Write (f, '0'); 
Write (f, 'F'); 
Write (f, 'B'); (* checksum: one's complement of 04H is FBH *) 

Write (f, ASCII.cr); 

Write (f, ASCII.lf); 

Write (f, ASCII.cr); 

Write (f, ASCII.lf); 
END EndSrec; 


BEGIN (* Initialization *) 
LongClear (LZero); 
END Srecord. 
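
The record layout and checksum rule that the Srecord module implements digit by digit can be summarized in a few lines. This is a minimal sketch in Python, not a translation of the module: it assumes whole bytes rather than arrays of hex digits, and the function name srecord is hypothetical. Each record carries a type digit, a byte count (address bytes plus data bytes plus one checksum byte), the address, the data, and a checksum that is the one's complement of the low byte of the sum of the count, address and data bytes.

```python
# Address width in bytes for the record types used by the assembler
# (S0 header, S2 code/data, S8 termination), plus the 16-bit S1/S9 pair.
ADDRESS_BYTES = {0: 2, 1: 2, 2: 3, 8: 3, 9: 2}


def srecord(rtype, address, data=b""):
    """Format one Motorola S-record as a string of ASCII hex digits."""
    addr = address.to_bytes(ADDRESS_BYTES[rtype], "big")
    count = len(addr) + len(data) + 1           # +1 for the checksum byte
    body = bytes([count]) + addr + bytes(data)
    checksum = ~sum(body) & 0xFF                # one's complement of the low byte
    return "S%d%s%02X" % (rtype, body.hex().upper(), checksum)


if __name__ == "__main__":
    print(srecord(0, 0, b"HI"))                 # header record carrying a name
    print(srecord(2, 0x001000, bytes([0x4E, 0x75])))   # one RTS at $001000
    print(srecord(8, 0))                        # termination record
```

The Complement procedure in the listing arrives at the same value by taking the two's complement and then subtracting one.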


Listing 10.12 


IMPLEMENTATION MODULE CmdLin2; 
(* Parses command line - returns pointer to an array of pointer to strings *) 


FROM SYSTEM IMPORT 
ADDRESS, ADR; 


CONST 
MAXARGS = 5; 


VAR 
CommandLine[80H] : ARRAY [0..7FH] OF CHAR; 
Arguments : ARRAY [0..MAXARGS - 1] OF ADDRESS; 


PROCEDURE ReadCmdLin (VAR ArgC : CARDINAL; VAR ArgV : ADDRESS); 
(* Gives count of items in command line, and an array of pointer to them *) 


VAR 
i, C : CARDINAL; 


BEGIN 
IF ORD (CommandLine[0]) = 0 THEN 
ArgC := 0; (* Nothing in Command Tail Buffer *) 
ArgV := NIL; 
ELSE 
i := 1; C := 0; 
LOOP 
WHILE CommandLine[i] = ' ' DO (* Skip Blanks *) 
INC (i); 
END; 
IF CommandLine[i] = 0C THEN (* end of tail buffer *) 
EXIT; 
ELSE 
Arguments[C] := ADR (CommandLine[i]); 
INC (C); 
IF C = MAXARGS THEN 
EXIT; 
END; 


END; 
WHILE CommandLine[i] # ' ' DO (* Advance to next Argument *) 
INC (i); 
IF CommandLine[i] = 0C THEN 
EXIT; 
END; 
END; 
CommandLine[i] := 0C; (* Terminate Argument *) 
INC (i); 
END; (* LOOP *) 
CommandLine[0] := 0C; (* Command Tail must only be used once *) 
ArgC := C; 
ArgV := ADR (Arguments) ; 


END; 
END ReadCmdLin; 


END CmdLin2. 


Listing 10.13 


DEFINITION MODULE LongNumbers; 
(* Routines to handle HEX digits for the X68000 cross assembler. *) 
(* All but LongPut and LongWrite are limited to 8 digit numbers. *) 


FROM Files IMPORT 
FILE; 


EXPORT QUALIFIED 
LONG, LongClear, LongAdd, LongSub, LongInc, LongDec, 
LongCompare, CardToLong, LongToCard, LongToInt, LongPut, 
LongWrite, StringToLong, BinStrToLong, AddrBoundL, AddrBoundW; 


CONST 
DIGITS = 8; 
BASE = 16; 
TYPE 
LONG = ARRAY [1..DIGITS] OF INTEGER; 


PROCEDURE LongClear (VAR A : LONG); 
(* Sets LONG to Zero *) 


PROCEDURE LongAdd (A, B : LONG; VAR Result : LONG); 
(* Add two LONGs, giving Result *) 


PROCEDURE LongSub (A, B : LONG; VAR Result : LONG); 
(* Subtract two LONGs (A - B), giving Result *) 


PROCEDURE CardToLong (n : CARDINAL; VAR A : LONG); 
(* Converts CARDINAL to LONG *) 


PROCEDURE LongToCard (A : LONG; VAR n : CARDINAL) : BOOLEAN; 
(* Converts LONG TO CARDINAL, returns FALSE if conversion impossible *) 
PROCEDURE LongToInt (A : LONG; VAR n : INTEGER) : BOOLEAN; 
(* Converts LONG to INTEGER, returns FALSE if conversion impossible *) 


PROCEDURE LongInc (VAR A : LONG; n : CARDINAL); 
(* Increment LONG by n *) 


PROCEDURE LongDec (VAR A : LONG; n : CARDINAL); 
(* Decrement LONG by n *) 


PROCEDURE LongCompare (A, B : LONG) : INTEGER; 
(* Returns: 0 if A = B, -1 if A < B, +1 if A > B *) 


PROCEDURE LongPut (f : FILE; A : ARRAY OF INTEGER; Size : CARDINAL) ; 
(* Put LONG number in FILE f *) 


PROCEDURE LongWrite (A : ARRAY OF INTEGER; Size : CARDINAL); 
(* Write LONG number to console screen *) 


PROCEDURE StringToLong (S : ARRAY OF CHAR; VAR A : LONG) : BOOLEAN; 
(* Converts a string (in HEX) into a LONG *) 


PROCEDURE BinStrToLong (S : ARRAY OF CHAR; VAR A : LONG) : BOOLEAN; 
(* Converts a string (in Binary, maximum of 16 bits) into a LONG *) 


PROCEDURE AddrBoundL (VAR A : LONG); 
(* Forces Address to a 68000 long word boundary *) 


PROCEDURE AddrBoundW (VAR A : LONG); 
(* Forces Address to a 68000 word boundary *) 


END LongNumbers. 


Listing 10.14 


DEFINITION MODULE Parser; 
(* Reads the Source file, and splits each *) 
(* line into Label, OpCode & Operand(s). *) 


FROM Strings IMPORT 
STRING; 


FROM Files IMPORT 
FILE; 


EXPORT QUALIFIED 
TOKEN, OPERAND, Line, LineCount, OpLoc, SrcLoc, DestLoc, LineParts; 


CONST 
TokenSize = 8; 
OperandSize = 20; 


TYPE 
TOKEN = ARRAY [0..TokenSize] OF CHAR; 
OPERAND = ARRAY [0..OperandSize] OF CHAR; 


VAR 
OpLoc, SrcLoc, DestLoc : CARDINAL; 
Line : STRING; 
LineCount : CARDINAL; 
PROCEDURE LineParts (f : FILE; VAR EndFile : BOOLEAN; 
VAR Label, OpCode : TOKEN; 
VAR SrcOp, DestOp : OPERAND); 
(* Reads Line, breaks into tokens, on-passes to symbol & code generators *) 


END Parser. 


Listing 10.15 


DEFINITION MODULE ErrorX68; 
(* Displays error messages for X68000 cross assembler *) 


FROM Files IMPORT 
FILE; 


EXPORT QUALIFIED 
ErrorType, ErrorCount, Error, WriteErrorCount; 


TYPE 
ErrorType = (Dummy, TooLong, NoCode, SymDup, Undef, SymFull, Phase, 
ModeErr, OperErr, BraErr, AddrErr, SizeErr, EndErr); 


VAR 
ErrorCount : CARDINAL; 
PROCEDURE Error (Pos : CARDINAL; ErrorNbr : ErrorType) ; 


(* Displays Error #ErrorNbr, then waits for any key to continue *) 


PROCEDURE WriteErrorCount (f : FILE); 
(* Error count output to Console & Listing file *) 


END ErrorX68. 


Listing 10.16 


DEFINITION MODULE SymbolTable; 
(* Initializes symbol table. Maintains list of all labels, *) 
(* along with their values. Provides access to the list. *) 


FROM LongNumbers IMPORT 
LONG; 


FROM Parser IMPORT 
TOKEN; 


EXPORT QUALIFIED 
FillSymTab, SortSymTab, ReadSymTab, ListSymTab; 
PROCEDURE FillSymTab (Label : TOKEN; Value : LONG; VAR Full : BOOLEAN) ; 


(* Add a symbol to the table *) 


PROCEDURE SortSymTab (VAR NumSyms : CARDINAL) ; 
(* Sort symbols into alphabetical order *) 




PROCEDURE ReadSymTab (Label : ARRAY OF CHAR; 

VAR Value : LONG; VAR Duplicate : BOOLEAN) : BOOLEAN; 
(* Passes Value of Label to calling program -- returns FALSE if the *) 
(* Label is not defined. Also checks for Multiply Defined Symbols *) 


PROCEDURE ListSymTab (i : CARDINAL; VAR Label : TOKEN; VAR Value : LONG); 
(* Returns the i-th item in the symbol table *) 


END SymbolTable. 


Listing 10.17 


DEFINITION MODULE OperationCodes; 
(* Initializes lookup table for Mnemonic OpCodes. Searches the table *) 
(* and returns the bit pattern along with address mode information. *) 


FROM Parser IMPORT 
TOKEN; 


EXPORT QUALIFIED 
ModeTypeA, ModeTypeB, ModeA, ModeB, Instructions; 


TYPE 

ModeTypeA = (RegMem3, (* 0 = Register, 1 = Memory *) 
Ry02, (* Register Rx -- Bits 0-2 *) 
Rx911, (* Register Ry -- Bits 9-11 *) 
Data911, (* Immediate Data -- Bits 9-11 *) 
CntR911, (* Count Register or Immediate Data *) 
Brnch, (* Relative Branch *) 

DecBr, (* Decrement and Branch *) 
Data03, (* Used for VECT only *) 
Data07, (* Branch & MOVEQ *) 
OpM68D, (* Data *) 

OpM68A, (* Address *) 

OpM68C, (* Compare *) 

OpM68x, (* XOR *) 

OpM68S, (* Sign Extension *) 
OpM68R, (* Register/Memory *) 
OpM37) ; (* Exchange Registers *) 

ModeTypeB = (Bit811, (* BIT operations - bits 8/11 as switch *) 
Size67, (* 00 = Byte, 01 = Word, 10 = Long *) 
Size6, (* 0 = Word, 1 = Long *) 
Size1213A, (* 01 = Byte, 11 = Word, 10 = Long *) 
Size1213, (* 11 = Word, 10 = Long *) 
Exten, (* OpCode extension required *) 
EA05a, (* Effective Address - ALL *) 
EA05b, (* Less 1 *) 
EA05c, (* Less 1, 11 *) 
EA05d, (* Less 9, 10, 11 *) 
EA05e, (* Less 1, 9, 10, 11 *) 
EA05f, (* Less 0, 1, 3, 4, 11 *) 
EA05x, (* Dual mode - OR/AND *) 
EA05y, (* Dual mode - ADD/SUB *) 
EA05z, (* Dual mode - MOVEM *) 
EA611); (* Used only by MOVE *) 


ModeA = SET OF ModeTypeA; 
ModeB = SET OF ModeTypeB; 


PROCEDURE Instructions (MnemonSym : TOKEN; 

OpLoc : CARDINAL; VAR Op : BITSET; 

VAR AddrModeA : ModeA; VAR AddrModeB : ModeB); 
(* Uses lookup table to find addressing mode & bit pattern of opcode. *) 


END OperationCodes. 


Listing 10.18 


DEFINITION MODULE SyntaxAnalyzer; 
(* Analyzes the operands to provide information for CodeGenerator *) 


FROM LongNumbers IMPORT 
LONG; 


FROM Parser IMPORT 
OPERAND; 


EXPORT QUALIFIED 


SizeType, OpConfig, OpMode, Xtype, (* TYPEs *) 
GetValue, GetSize, (* PROCEDURE's *) 
GetInstModeSize, GetOperand, GetMultReg; (* PROCEDURE's *) 
TYPE 
OpMode = (DReg, (* Data Register *) 
ARDir, (* Address Register Direct *) 
ARInd, (* Address Register Indirect *) 
ARPost, (* Address Register with Post-Increment *) 
ARPre, (* Address Register with Pre-Decrement *) 
ARDisp, (* Address Register with Displacement *) 
ARDisX, (* Address Register with Disp. & Index *) 
AbsW, (* Absolute Word (16-bit Address) *) 
AbsL, (* Absolute Long (32-bit Address) *) 
PCDisp, (* Program Counter Relative, with Displacement *) 
PCDisX, (* Program Counter Relative, with Disp. & Index *) 
Imm, (* Immediate *) 
MultiM, (* Multiple Register Move *) 
SR, (* Status Register *) 
CCR, (* Condition Code Register *) 
USP, (* User's Stack Pointer *) 
Null); (* Error Condition, or Operand missing *) 


Xtype = (X0, Dreg, Areg); 
SizeType = (S0, Byte, Word, S3, Long); 


OpConfig = RECORD (* OPERAND CONFIGURATION *) 
Mode : OpMode; 
Value : LONG; 


Loc : CARDINAL; (* Location of Operand on line *) 

Rn : CARDINAL; (* Register number *) 

Xn : CARDINAL; (* Index Reg. nbr. *) 

Xsize : SizeType; (* size of Index *) 

X : Xtype; (* Is index Data or Address register? *) 


END; 
PROCEDURE GetValue (Operand : OPERAND; VAR Value : LONG); 
(* determines value of operand (in Decimal, HEX, or via Symbol Table) *) 


PROCEDURE GetSize (VAR Symbol : ARRAY OF CHAR; VAR Size : SizeType); 
(* determines size of opcode: Byte, Word, or Long *) 


PROCEDURE GetAbsSize (VAR Symbol : ARRAY OF CHAR; VAR AbsSize : SizeType); 
(* determines size of operand: Word or Long *) 


PROCEDURE GetInstModeSize (Mode : OpMode; Size : SizeType; 
VAR InstSize : CARDINAL) : CARDINAL; 
(* Determines the size for the various instruction modes. *) 


PROCEDURE GetOperand (Oper : OPERAND; VAR Op : OpConfig); 
(* Finds mode and value for source or destination operand *) 


PROCEDURE GetMultReg (Oper : OPERAND; PreDec : BOOLEAN; 
Loc : CARDINAL; VAR MultExt : BITSET); 
(* Builds a BITSET marking each register used in a MOVEM instruction *) 


END SyntaxAnalyzer. 


Listing 10.19 


DEFINITION MODULE CodeGenerator; 


(* Uses information supplied by Parser, OperationCodes, *) 
(* and SyntaxAnalyzer to produce the object code. *) 


FROM Parser IMPORT 
TOKEN, OPERAND; 


FROM LongNumbers IMPORT 
LONG; 


EXPORT QUALIFIED 
LZero, AddrCnt, Pass2, BuildSymTable, AdvAddrCnt, GetObjectCode; 


VAR 
LZero, AddrCnt : LONG; 
Pass2 : BOOLEAN; 


PROCEDURE BuildSymTable (VAR AddrCnt : LONG; 
Label, OpCode : TOKEN; 
SrcOp, DestOp : OPERAND); 
(* Builds symbol table from symbolic information of Source File *) 


PROCEDURE AdvAddrCnt (VAR AddrCnt : LONG); 
(* Advances the address counter based on the length of the instruction *) 


PROCEDURE GetObjectCode (Label, OpCode : TOKEN; 
SrcOp, DestOp : OPERAND; 
VAR AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 
VAR nA, nO, nS, nD : CARDINAL); 


(* Determines the object code for the operation as well as the operands *) 
(* Returns each (up to 3 fields), along with their length *) 


END CodeGenerator. 
Listing 10.20 


DEFINITION MODULE Listing; 
(* Creates a program listing, including Addresses, Code & Source. *) 


FROM Files IMPORT 
FILE; 


FROM LongNumbers IMPORT 
LONG; 


EXPORT QUALIFIED 
StartListing, WriteListLine, WriteSymTab; 


PROCEDURE StartListing (f : FILE); 


(* Sign on messages for listing file -- initialize *) 


PROCEDURE WriteListLine (f : FILE; 
AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 
nA, nO, nS, nD : CARDINAL); 
(* Writes one line to the Listing file, Including Object Code *) 


PROCEDURE WriteSymTab (f : FILE; NumSym : CARDINAL); 
(* Lists symbol table in alphabetical order *) 


END Listing. 


Listing 10.21 


DEFINITION MODULE Srecord; 


(* Creates Motorola S-records of program: *) 
(* S0 = header record, *) 
(* S2 = code/data records (24 bit address), *) 
(* S8 = termination record (24 bit address). *) 


FROM Files IMPORT 
FILE; 


FROM LongNumbers IMPORT 
LONG; 


EXPORT QUALIFIED 
StartSrec, WriteSrecLine, EndSrec; 


PROCEDURE StartSrec (f : FILE; SourceFN : ARRAY OF CHAR); 
(* Writes S0 record (HEADER) and initializes *) 


PROCEDURE WriteSrecLine (f : FILE; 
AddrCnt, ObjOp, ObjSrc, ObjDest : LONG; 
nA, nO, nS, nD : CARDINAL); 
(* Collects Object Code -- Writes an S2 record to file if line is full *) 


PROCEDURE EndSrec (f : FILE); 
(* Finishes off any left-over (Partial) S2 line, *) 
(* and then writes S8 record (TRAILER) *) 


END Srecord. 


Listing 10.22 


DEFINITION MODULE CmdLin2; 
(* Parses command line - returns pointer to an array of pointer to strings *) 


FROM SYSTEM IMPORT 
ADDRESS; 


EXPORT QUALIFIED 
ReadCmdLin; 


PROCEDURE ReadCmdLin (VAR ArgC : CARDINAL; VAR ArgV : ADDRESS); 
(* Gives count of items in command line, and an array of pointer to them *) 


END CmdLin2. 
68000 
Coding Conventions 


Jan Steinman 


Assembly language programming requires a special sort of discipline, 
especially if others will be modifying your code. The practical advice 
offered below will help make the process easier. 


So you have your Motorola reference manual, a shiny new machine, and 
you're tired of running games written in BASIC. What happens next? If 
you're reading this, you probably aren't the sort of person who is content with 
the canned programs on the market, and you probably get a big kick out of 
writing programs that run as efficiently as possible. The answer is, of course, 
assembly code. But before jumping into coding tricks, let's think a bit about 
what makes good assembly code. 


Practices Make Perfect 

Assembly coding practices are ways of dealing with the problem, and the 
MC68000 fosters a certain style of coding practice. I will present a number of 
ideas about good coding style, but it is up to you to come up with a set of 
standards you can live with. It really helps to be consistent, as you will 
appreciate when you first look through your own "pre-standards" code after 
adopting some standards. 

Modular design is the best way to manage assembly coding projects. Due 
to its inherently unstructured nature, assembly code often wanders about 
through branches and jumps, in what is known as "spaghetti code," which is 
a maintenance nightmare! 

Good modular design requires short code sequences that perform a single 
function in each sequence. Such "modules" should generally be less than one 
page in length. Also important is having a single entry point and a single 
exit point, although this is not a firm rule. 
Once you have decided to use modular design, a form of structured commentary 
called the module specification is important. It can vary widely in 
content, but should include, as a minimum: 

* a brief statement of the purpose of the module; 

* the conditions that need to be set up before the module is entered; 

* the conditions produced as a result of the execution of the module; 

* side effects, such as register usage, global variables affected, amount of 
stack needed and so on. 

Also desirable, especially for multiuser projects and for modifying existing 
code, are: 

* a revision log, giving the modifier's name, the date modified and a brief 
description of the changes made; 

* a list of global or external references; 

* a list of callers of this routine and routines called; 

* exceptional termination conditions and the results they produce; 

* box drawings of key data structures; 

* pseudo-code description of module's operation. 

Figure 11.1 shows a template that might be used for a module specification. 


*** Dinglewaite ************************************************************
*
* This module performs certain functions that should be described in this
* paragraph, but since this is only a template, it can't really, now, can
* it?
*
* Entry:  a0 -- pointer to part of the input data.
*         a1 -- pointer to some more of the input data.
*         d0 -- flag used to signal alternate functions.
*
* Exit:   a2 -- pointer to the output data generated.
*
* Uses:   d1 -- intermediate results.
*         d2 -- mask for MSB.
*
* Stack:  requires 40 bytes.
*
* Global: none used.
*
* Called: Init, system startup code.
*         Exit, system exit code.
*         Furblesnotzer, mutually recursive call.
*
* Calls:  Furblesnotzer routine to obtain Hrair's constant.
*
* Log:
* 860703 jans: Initial entry.
* 860705 jans: Massive revisions refurbishing the furblesnotzer section.
* 860706 bay:  Undid stupid changes done in haste by jans.


FIGURE 11.1 A module specification template. 
Non-specification commentary is extremely important, and the care with 
which it is written has a direct impact on the understandability of the code. A 
comment per line, plus one or more comment lines per block of "difficult" 
code, is recommended. The most easily understood comments read like a 
story, sort of a running commentary on what the code is doing. "Useless" 
comments are those that merely put the assembly code into words— 
INCREMENT D0 and MOVE A4 INTO A6, for example. 

I spend nearly as much time on the commentary as I do on the code, but 
find that it takes nearly twice as long to make use of previously written code 
that has poor commentary. If you ever plan to reuse your old code, the time 
spent producing good commentary is free! 


Why Do I Need Standards? 


Writing assembly code that is both fast and maintainable seems an 
impossible task. There are historical reasons why assembly code has lacked 
standards. Many crack assembly coders, often treated with the reverence usually 
reserved for movie stars and sports heroes, consider "fast" and "maintainable" 
to be contradictory, and cite their expertise as justification for producing 
"write-only" (unreadable) code. Writing maintainable code has traditionally 
been seen as an impediment to job security. ("If you can write your way into 
a job, why write yourself out of one?" as someone once said.) This argument 
loses, however, when an exciting new project comes along and the maintainer 
isn't available because no one will take over maintenance of his code. 

For hobbyists, the "standards problem" is even more prevalent. Small 
operating systems found on today's affordable 68000-based systems lack 
facilities for code revision control, and few hobbyists can afford hard disks. 
These limitations have led to poor documentation and little commentary, as 
hobbyists try to squeeze every last byte out of their systems. This practice is 
often regretted when bugs are discovered or improvements are desired. 

The use of coding standards is every bit as important as the actual code 
written. Such standards foster communication of the intent of the code—not 
only to other users but to the author—and subsequently speed coding and 
reduce the occurrence of bugs and design flaws. 


Different Styles for Different Applications 

After you decide on a project, decompose it into modules and set the 
standards to use, the next step is to examine the needs of the application for 
things that will impact coding style. Is the code going in ROM? Will it be 
used on a multitasking system? Will it be used recursively? Will it need to 
work at different locations? 

The Commodore Amiga, for example, will run your code at an arbitrary 
location that is different each time you run the program. A position- 
independent program will load much faster in such an environment. Code that 
is reentrant can be shared among different tasks, or be called recursively. Code 
in ROM must pay attention to separating initialized data from data in RAM, 
and usually must be kept as small as possible. 
Declaration of Position Independence 

Position-independent (PI) code can be loaded and run without modification 
at any arbitrary location in memory. Such code is desirable in systems that do 
not have memory management hardware. Such systems often have relocating 
loaders, which take non-PI code and sum an offset into all the addresses found 
in the code. Although adjusting the addresses is generally not very time- 
consuming, avoiding the relocation process entirely often speeds program 
loading. 
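
What a relocating loader does can be sketched in a few lines. The following is a minimal illustration in Python, assuming a hypothetical load-file format in which the image was assembled for address zero and is accompanied by a list of byte offsets (here called fixups) at which 32-bit absolute addresses are stored; real loaders use their own formats.

```python
def relocate(image, fixups, load_addr):
    """Add load_addr to each 32-bit big-endian address listed in fixups."""
    out = bytearray(image)
    for off in fixups:
        value = int.from_bytes(out[off:off + 4], "big")
        value = (value + load_addr) & 0xFFFFFFFF
        out[off:off + 4] = value.to_bytes(4, "big")
    return bytes(out)


if __name__ == "__main__":
    # JSR $00000010 followed by RTS, assembled as if loaded at address 0.
    image = bytes.fromhex("4EB9000000104E75")
    print(relocate(image, [2], 0x20000).hex().upper())
```

A position-independent image has an empty fixup list, so the loop has nothing to do and the image can be run exactly as it was read in.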

Of course, if you are at all interested in assembly coding, it is probably 
because of speed or space limitations. Here's a nice bonus: PI code is often 
both faster and smaller than non-PI code, especially if the target data has an 
address that cannot be expressed as a simple 16-bit displacement, that is, an 
absolute address of 8000 hex or higher. Figure 11.2 offers a comparison of PI 
and non-PI data accesses. 

It is when data needs to be written that PI code incurs overhead. PC- 
relative addressing is not allowed as a destination operand on the 68000. One 
way to overcome this problem is to use an extra LEA instruction to resolve 
the destination address before it is written. However, writing a PC-relative 
location in this manner produces non-reentrant code. The optimal solution is 
closely related to reentrancy issues. 


Reentrant Modules Can Be Shared 

In multitasking systems, it is often desirable to have a single copy of code 
that can be used by different tasks. Library routines are often sharable among 
tasks, and such things as display format and graphics routines need only have 
a single copy of the code in memory. Multiuser systems have an even greater 
need for sharable code; if each of 50 users on a Unix minicomputer needed a 
separate copy of the shell in memory, it would quickly run out of memory! 


CharTab DC.B    '0123456789ABCDEF'

* Position independent indexed data references use PC relative addressing.
*
        move.b  CharTab(pc,d5),(a0)+                  4 bytes, 18 clocks
*
* If "CharTab" is in the first 32k, Register relative with displacement
* addressing is slightly faster.
*
        move.b  CharTab(d5),(a0)+                     4 bytes, 16 clocks
*
* If "CharTab" has an address greater than $7FFF, register relative cannot be
* used. Two instructions and a scratch register are needed. Addresses over
* $FFFF incur even more overhead.
*
        lea     CharTab,a1      CharTab<=$FFFF        4 bytes, 6 clocks
*                               CharTab>$FFFF         8 bytes, 12 clocks
        move.b  (a1,d5),(a0)+                         4 bytes, 18 clocks
*                               Total:                8-12 bytes, 24-30 clocks


FIGURE 11.2 Comparison of PC-relative indexed and absolute indexed. 
Reentrant-capable code does not change in any way as a result of being 
executed. This behavior ensures that the code in one task can be interrupted, a 
second task can completely execute the same code and the first task can then 
resume execution of the interrupted code without ill effect. 

The key to writing reentrant code is to avoid modifying persistent data. 
That means that global variables are out, that all parameters must be passed 
on the stack and that local variables must be written in a way that keeps them 
from getting clobbered if the module is reentered. 
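
As a rough illustration in C (the language used for examples later in this
chapter), the rule can be sketched as follows. The function names and the
"task" label format are invented for this sketch and do not come from any
listing in the book. The first routine keeps its result in persistent
storage; the second writes only to storage owned by its caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* NOT reentrant: the result lives in one static buffer shared by every
 * caller, so a second call overwrites what the first caller is using. */
static const char *label_static(int n)
{
    static char buf[16];
    snprintf(buf, sizeof buf, "task%d", n);
    return buf;
}

/* Reentrant-capable: the only writable storage is supplied by the caller,
 * typically a local array in the caller's own stack frame. */
static char *label_reentrant(int n, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    snprintf(buf, size, "task%d", n);
    return buf;
}
```

Two tasks that call label_static() receive the same pointer, so the later
call silently changes the earlier caller's text. With label_reentrant()
each caller's buffer is untouched by the other.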

The typical way to provide reentrant-capable writable storage is to allocate 
it on the processor stack, a procedure that has the side benefit of producing 
position-independent code. This results in smaller load modules than when 
large, static buffers are used, and the code is inherently ROMable because all 
writable storage is segregated from constants and program code. An often- 
overlooked benefit is that writing reentrant-capable code tends to produce 
structured, modular code because the programmer cannot arbitrarily access data 
in other modules. 

Motorola knew programmers would want to write reentrant code and 
included the LINK and UNLK instructions for that purpose. To use these 
instructions, you have to know how much room you'll need for your writable 
data. It is a good idea to use an EQU directive so that changes will only have
to be made in one place. Note that the size used in the LINK instruction must 
be negative, or previously written stack data will get destroyed. 

The LINK instruction is then used with a register and the size needed. The 
register used will then be the base of the local storage area, or the frame 
pointer. Indices or offsets from the frame pointer now refer to your local data 
area, which can be written and read in a position-independent manner. The 
frame pointer will eventually be needed by the UNLK instruction, so don't 
change its value unless you can get it back when needed. Figure 11.3 shows 
an example of using the LINK and UNLK instructions to create local storage 
and subsequent use of the storage. 

Reentrant-capable code does have a few disadvantages. When speed is more 
important than reusability, the overhead of the LINK and UNLK instructions 
can be saved and an extra register that would normally be used for the frame 
pointer is available. Local variables that are used only once are expensive in 
reentrant-capable code, and it is a bit harder to write. But unless speed is the 
overriding issue, it is good practice to write reentrant-capable code whenever 
possible. 


Recursion in Assembly Code 

Writing recursive assembly code routines is not nearly as hard as it seems. 
Recursion is nothing but a special case of reentry. If you don't actually need 
reentrant-capable code because of reasons cited earlier, the rules are different 
when writing recursive code. For one thing, you know exactly where the code 
will be "interrupted"—namely, at the recursive call. If you're taking the 
trouble to write recursive assembly code, you probably don't want the 
overhead of saving and restoring all the processor registers at each recursive
call. Only registers that represent persistent data (data that will be needed after
the recursive call) need be saved. Such code is called serially reusable code.


* For position-independent, writable storage, use the LINK and UNLK
* instructions to allocate space on the stack. Then use register
* relative addressing to access the storage. A6 is traditionally used
* for this purpose.

CharTab  DC.B    '0123456789ABCDEF'
TABSIZE  EQU     *-CharTab
STR      EQU     0               First local variable, string,
*                                of proper size for CharTab.
INT      EQU     STR+TABSIZE     Second var is an integer.
CHAR     EQU     INT+4           Third local variable is a character.
SHORT    EQU     CHAR+1          Fourth local variable is 16 bits.
LOCAL_1  EQU     SHORT+2

Module1  link    a6,#-LOCAL_1    Create local storage area for Module1.
         move.w  #TABSIZE-1,d0   Get the size of the source data,
loop1:   move.b  CharTab(pc,d0),(a6,d0)  move it using PC relative,
         dbra    d0,loop1        until it's all gone.

         move.b  STR(a6),CHAR(a6)        Move local variables:
         move.w  #$1234,SHORT(a6)
         move.l  #$56789ABC,INT(a6)
         ...
         unlk    a6              Release the storage back to the stack,
         rts                     before leaving the module.


FIGURE 11.3 Example of position-independent, writable storage.

Recursive solutions are generally not as efficient as iterative ones. In some 
cases, though, eliminating the recursion is difficult, particularly if there is 
more than one recursive call. When working in assembly code, the overhead 
of recursion can be reduced considerably by identifying and saving only the 
persistent data. These factors combine to make recursion attractive for some 
assembly code problems. 


Jump Tables 

A typical way to change flow of control when there is more than a two- 
way choice is with a jump table. Such tables are quite easy to code using 
absolute addressing, but cannot easily be loaded in locations other than the 
original location. Relative jump tables (branch tables) require that the
number stored in the jump table be relative to some point in the code, usually 
the base of the jump table itself. Note that the branch table can use 16-bit 
entries if all its modules are within 32K of the table, whereas jump tables can 
use 16-bit entries only if all the routines are in the first 32K of memory. 
Figure 11.4 shows how jump tables and branch tables are constructed.


AbsTab   DC.L    SlartyBartfarst             Addresses of modules.
         DC.L    _Hexadecimalize
         DC.L    _Quicksort
RelTab   DC.W    SlartyBartfarst-RelTab      Offsets of modules.
         DC.W    _Hexadecimalize-RelTab
         DC.W    _Quicksort-RelTab


FIGURE 11.4 Absolute jump and relative branch tables.


************************ DispatcherAbs ************************
*
* Jump to the routine selected by an index. The jump table must be in the
* first 32k of memory space.
*
* Entry: a0 -- index of routine to jump to.
*
DispatcherAbs
         add.l   a0,a0               Make simple index into a word offset,
         add.l   a0,a0               then into a longword offset.
         move.l  AbsTab(a0),a0       Fetch the address from the jump table,
         jmp     (a0)                and jump to it.

* 10 bytes, 40 clocks.

************************ DispatcherRel ************************
*
* Branch to the routine selected by an index. The Branch table can be anywhere
* in memory, but must be within 32k of the Dispatcher and all dispatched
* routines.
*
* Entry: a0 -- index of routine to branch to.
*
DispatcherRel
         add.l   a0,a0               Make simple index into a word offset,
         move.w  RelTab(pc,a0.w),a0  fetch the branch offset from table,
         jmp     RelTab(pc,a0.w)     and jump, summing in the table base.

* 10 bytes, 36 clocks.


FIGURE 11.5 Comparison of absolute and relative jump table dispatchers.


The table dispatcher gets a bit more complicated when coded in a position- 
independent manner. There are two memory references that must be resolved 
in a position-independent manner: the address of the jump table entry and the 
address of the module referenced in that entry. PC-relative indexed addressing 
is used in each reference, and the ability to use just the lower half of a register 
as an index saves four clock cycles. Figure 11.5 shows the code for each type 
of dispatch routine. If you can assume that all of your routines will be in the 
first 32K of memory space, a 16-bit absolute dispatch routine will be faster 
although much less flexible.

In Figures 11.4 and 11.5, calling either Dispatcher with 1 in a0 will
execute the routine _Hexadecimalize; using 0 will cause the routine
SlartyBartfarst to be executed.

Jump tables that are built dynamically would, of course, have to be con- 
structed in a stack frame in order to be reentrant capable. 
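
The difference between the two kinds of table can be modeled in a few lines
of C. This is only a sketch under stated assumptions: an "address" is
treated as a plain 32-bit number, and the names addr_t, dispatch_abs and
dispatch_rel are hypothetical, chosen for this illustration. It mirrors the
arithmetic of the two dispatchers rather than their actual machine code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this model: an address is just a 32-bit number. */
typedef uint32_t addr_t;

/* Absolute jump table: each entry is the full address of a routine,
 * fixed for one particular load address. */
static addr_t dispatch_abs(const addr_t *table, unsigned index)
{
    assert(table != 0);
    return table[index];
}

/* Relative branch table: each entry is a signed 16-bit offset from the
 * base of the table itself, so the target is recomputed from wherever
 * the table currently lives. */
static addr_t dispatch_rel(addr_t table_base, const int16_t *offsets,
                           unsigned index)
{
    assert(offsets != 0);
    return table_base + (addr_t)(int32_t)offsets[index];
}
```

If the table and its routines are loaded 0x40000 bytes higher, passing the
new table base to dispatch_rel() yields targets that are 0x40000 higher as
well, while dispatch_abs() keeps returning the old, now stale, addresses.
The 16-bit entries are also what limits the routines to within 32K of the
table.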


Mixing Assembly Code With Other Languages 

The lack of structure inherent in assembly code tends to keep us re- 
inventing wheels. After spending hours getting a tight, efficient assembly 
code module to work properly, it seems a waste to use it in only one program 
or application. If you have used the methods outlined for writing reentrant- 
capable code, you have licked most of the hard part of high-level language 
(HLL) interface. Although most HLLs do not demand reentrant-capable code, 
it tends to be easier to interface. I'll use C for illustration; common procedural 
languages should be similar. 

C uses the stack for parameter passing and uses stack frames for local 
variables, exactly as shown earlier. The order in which parameters are passed 
on the stack is not specified by the C language in order to allow compiler 
writers to choose the most efficient way. Needless to say, your 68000 
assembly modules will not be portable to other processors, and you should be 
aware that they may not even be portable to other 68000-based C compilers. 

Determining the parameter-passing protocol for your C compiler will 
require some experimentation on your part. Figure 11.6 is a simple test 
program that can be used to discover the way your compiler passes 
parameters. Many compilers have an option that causes the compiler to emit 
assembly code for inspection. If yours does not, you will have to actually 
execute the test program using a disassembling debugger. If you don't have a 
disassembling debugger, you may just have to dump the object code, sit down 
with the Motorola book and disassemble it by hand. 

Figure 11.7 shows the assembly code produced by the Uniflex C compiler. 
It is obvious that this compiler pushes the parameters on the stack in reverse 
order. This method turns out to be the most common; it is probably safe to 
assume that your 68000 C compiler behaves the same if you can't actually 
check it. If you plan to do this often, you will probably want to build an 
"include" file that will have symbolic labels for the proper offsets into the 
stack for the various parameters. 

There are other "gotchas" associated with mixing assembler with HLL. 
Many compilers reserve certain registers for special purposes, and most that 
allow register variables expect the called module to save and restore any 
registers it uses. Your favorite assembly routine might well break some 
programs and not others by not following the compiler's conventions. 


Let's Get Practical 

By now you have enough knowledge to get dangerous, and the examples 
presented have probably stirred some thought. Let's put together some of the 
things we've covered to come up with a practical, working program.


/*
 * Parameter Test Program. This program is compiled to assembly code
 * for examination, NOT executed.
 */

main()
{
    dummy(1, 2, 3);     /* Pass some things to dummy routine. */
}


FIGURE 11.6 Test to determine parameter-passing protocol.


         global  main
_main    link    a6,#-(4)
         move.l  #$3,d2          Moves the last parameter first,
         move.l  d2,(sp)
         move.l  #$2,d2          followed by the middle,
         move.l  d2,-(sp)
         move.l  #$1,d2          and the first.
         move.l  d2,-(sp)
         jsr     _dummy
         add.l   #8,sp
         unlk    a6
         rts


FIGURE 11.7 Assembler output of parameter test program.


Listing 11.1 shows a module that converts a 32-bit quantity to eight 
hexadecimal ASCII characters. This is a "naive" implementation, using 
straightforward computation to get the job done. First, the arguments are 
copied from the stack into two registers. Constants for masking the least 
significant nybble and for the number of nybbles to convert are then loaded 
into registers. Now everything is ready for the main loop of the module. 

A copy of the integer to convert is shifted by a different amount each time 
through the loop, in order to process the most significant nybble first. Next, 
the shifted integer is masked for the least significant nybble. On line 12, the 
decision of whether the nybble is one of 0 through 9 or A through F is made. 
If the nybble is not expressible as a decimal digit, it has a special constant 
added to it—namely, the difference between the end of the digit sequence and 
the beginning of the alpha sequence in the standard ASCII set. 

In either case, the magic constant needed to turn a nybble into an ASCII 
character is summed in at line 1A. The resultant ASCII character is then put 
into the output string, the number of bits to shift is decremented for the next 
time through and the loop is repeated—unless the shift count was already 0, 
in which case the loop is exited and the module returns to whatever module 
called it. 
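
For readers who find the steps easier to follow in a high-level language,
here is a minimal C sketch of the same computation. It is not Listing 11.1;
the function name and its two-argument form are assumptions made for this
illustration, and like the module described above it writes exactly eight
characters with no terminator.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 32-bit value to eight ASCII hex characters, most significant
 * nybble first. Writes 8 bytes to out and does not add a terminator. */
static void hexadecimalize_naive(uint32_t value, char *out)
{
    int shift = 28;                     /* shift count for the top nybble */

    assert(out != 0);
    for (;;) {
        uint32_t nybble = (value >> shift) & 0xF;   /* shift, then mask */

        if (nybble > 9)                 /* not a decimal digit:          */
            nybble += 'A' - '9' - 1;    /* skip the gap between 9 and A  */
        *out++ = (char)(nybble + '0');  /* turn the nybble into ASCII    */

        if (shift == 0)                 /* shift count already 0: done   */
            break;
        shift -= 4;                     /* next less significant nybble  */
    }
}
```

The test on each nybble is the data-dependent branch that makes the running
time vary with the input.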

This version of Hexadecimalize requires 38 bytes and 1552 clock cycles, 
on the average, to execute. The number of clock cycles will vary depending 
on how many non-decimal characters are produced. There are 10 instructions 
consuming an average of 94.25 clock cycles each time through the loop.


An Improved Hexadecimalizer

A problem with the computation-based Hexadecimalize is that the output
data is not a simple function of the input data. A test and a decision must be
made on the input data in order to choose one of two methods of generating the
output. This test is in the crucial inner loop. If we can eliminate the test, we 
should be able to speed things up considerably. 

A simple way to make the output data a simple function of the input data 
is to use a constant lookup table. Listing 11.2 shows such an imple- 
mentation. Lines 10 through 1A are much the same as in the previous 
version; arguments are copied into registers, and shift and mask constants are 
loaded. 

Now, instead of shifting the integer to be converted by huge amounts, we 
simply rotate the most significant nybble into the least significant nybble. 
Each time through the loop the next less significant nybble is shifted in. 
Next a copy is made, which is masked for the nybble of interest. 

Now our constant table comes into play. Using PC-relative addressing for 
position independence, we simply index into the table and pick out the proper 
character. This character is placed in the output buffer and a special looping 
instruction is used to repeat until our count of nybbles to process is 
exhausted. 
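
The table-driven approach can be sketched in C as well. Again this is not
Listing 11.2: the function name is hypothetical, C has no rotate operator
so the rotation is spelled out with two shifts, and an ordinary array index
stands in for the PC-relative indexed fetch.

```c
#include <assert.h>
#include <stdint.h>

/* Constant lookup table: the output is a simple function of the input. */
static const char CharTab[] = "0123456789ABCDEF";

/* Convert a 32-bit value to eight ASCII hex characters, most significant
 * nybble first. Writes 8 bytes to out and does not add a terminator. */
static void hexadecimalize_table(uint32_t value, char *out)
{
    int count;

    assert(out != 0);
    for (count = 8; count > 0; --count) {
        /* Rotate left by 4: the top nybble lands in the low nybble. */
        value = (value << 4) | (value >> 28);
        /* Mask a copy for the nybble of interest and look it up. */
        *out++ = CharTab[value & 0xF];
    }
}
```

The loop body contains no test on the data, which is why its running time
does not depend on the value being converted.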

This version of Hexadecimalize is superior for several reasons. Because of 
the simple algorithm, the number of bytes produced is easily changed. (It is 
left as an exercise to the reader to make the number of bytes produced a 
parameter that is passed into the module!) Although slightly larger at 44 
bytes, it is nearly twice as fast, executing in 880 clock cycles. The loop 
contains only six instructions, consuming 52 clocks each time through. This 
can be stated with certainty, since the number of clock cycles is deterministic, 
not dependent on the input data. 


Recursive Quicksort 

Now that we've demonstrated position independence, we might as well do 
something useful to demonstrate some features of reentrant-capable code. One 
of the more feared assembly code tasks is recursion, which requires careful 
thought about reentrancy issues. Recursive code is guaranteed to be reentered, 
albeit in a more controlled fashion than via rude interrupts. 

When was the last time you needed to sort something? When was the last 
time you sorted in some HLL and found it too slow? One of the most 
efficient general-purpose sorting algorithms, called Quicksort, was invented in 
1960 by C. A. R. Hoare. Quicksort is of special interest as an example of the 
appropriate use of recursion in assembly code; the recursion is not easily 
removed because it calls itself in two separate places. Because we are using 
assembly code, we can dispense with some of the overhead associated with 
recursion in HLLs and come up with an efficient version in spite of the 
recursion. 

Briefly, Quicksort operates under the "divide and conquer" idea. Small sets
of data can be sorted much more quickly than large data sets, and Quicksort
alternately sorts and divides through the data until it has used each item in the
data set as a comparison key, at which point it returns up through the
recursive calls.

The algorithm uses an arbitrarily chosen partition value for dividing the 
array to sort. The array is then scanned from both ends, looking for value 
pairs that are "backwards" with respect to the partition value. The out-of- 
sequence items are then swapped. The scanning and swapping continues until 
the scanning pointers meet. At that point, a recursive call is made to sort the 
lower part of the array, followed by a recursive call to sort the upper part. 
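
A compact C sketch of the algorithm just described follows. It is written
for this explanation and is not the C version printed beside Listing 11.3;
the names ls and rs for the two scanning indices, the use of the last item
as the partition value, and the inclusive bounds l and r are assumptions of
the sketch.

```c
#include <assert.h>

/* Sort a[l..r] (inclusive bounds) into ascending order. */
static void quicksort(long *a, int l, int r)
{
    long v, t;
    int ls, rs;

    assert(a != 0);
    if (r <= l)                         /* zero or one item: stop recursing */
        return;

    v  = a[r];                          /* partition value: the last item */
    ls = l - 1;                         /* scans up from the bottom */
    rs = r;                             /* scans down from the top */

    for (;;) {
        while (a[++ls] < v)             /* stop at a value >= v */
            ;
        while (rs > l && a[--rs] > v)   /* stop at a value <= v */
            ;
        if (ls >= rs)                   /* scanning indices met or crossed */
            break;
        t = a[ls]; a[ls] = a[rs]; a[rs] = t;    /* swap the backwards pair */
    }
    t = a[ls]; a[ls] = a[r]; a[r] = t;  /* put the partition value in place */

    quicksort(a, l, ls - 1);            /* sort the lower part */
    quicksort(a, ls + 1, r);            /* sort the upper part */
}
```

The two calls at the end are the reason the recursion is hard to remove:
after the first call returns, the routine still needs ls+1 and r for the
second.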

In Listing 11.3, lines 0 and 4 copy the arguments off the stack into 
registers. The first thing that must be done is to see if it is time to quit 
recursing! If not for the compare and branch on lines 8 and A, this module 
would never return. The partition value is arbitrarily taken as the last item in 
the area to be sorted, and pointers to the first and last items to be sorted are 
copied into the registers that will be scanning from either end. 

The action begins at line 12, which is the top of both the outer loop and 
the first of the inner loops. Lines 12 and 14 start from the bottom and 
compare each value with the partition value, looking for a value that is 
greater than or equal to the partition value. When it is found, line 16 backs 
the pointer up so that it will be pointing to the item of interest, not past it. 

Similar but opposite action happens at lines 18 and 1A. Starting at the 
top, values are compared to the partition value, stopping when one is found 
that is less than or equal to the partition value. 

Now we have two pointers pointing at values that lie on either side of a 
partition value. Before swapping them, we need to check (in lines 1C and 1E) 
that our pointers have not already crossed. Assuming that the pointers have 
not crossed, we then swap the values pointed to by the scanning pointers in 
lines 20 through 24 before repeating the outer loop. 

If the pointers had been found to have crossed in the test on line 1C, it
would be time to exit the loop and recurse on the two partitions. Lines 28 
through 2C swap the final value in order to provide the recursive call with a 
new partition value. 

The best part is coming. Assembly code gives the programmer the freedom 
to violate the rules imposed by HLLs when efficiency is of paramount 
importance. In a HLL implementation of Quicksort (such as the C 
implementation in the right margin of Listing 11.3), the recursive call would 
save on the stack all the registers that are used in the module. This time- 
consuming action is the biggest cause of recursion's poor reputation for 
speed. However, all our local variables are in registers, and we need to save 
only those that will be needed after the recursive call. It turns out that only 
two pieces of information, ls+1 and r, need to be saved. Lines 2E and 30 push
these values on the stack in preparation for the second recursive call. Line 34
calculates the value ls-1 while moving it into the proper register for the first
recursive call. 

Since there is no more state to save here, we can pass the parameters for
the first recursive call in registers, something you simply cannot do in most
HLLs. In order to do this, we use a second entry point at line 8, which
slightly violates the rules of modularity but saves four stack accesses (two
pushes before the call and two reads after the call). Note that PEA and LEA
can be used to sum constants with registers, storing the results in different
registers. Upon return from the second recursive call, the arguments are
removed from the stack and control returns to the caller in lines 3C and 3E.


Random Array

0000198C 000063BF 00002B78 00002D2E 00000474 0000387C 0000420C 00000E88
00004795 0000281A 0000364A 00003226 00000742 00001A4C 00002778 00000836
00005526 0000218C 000066CF 00002102 00005940 000017F2 00006EDD 00006B00
000066D6 0000611B 000036D9 00005DB4 00006E0F 000057D3 00005B7C 00002DCA
00005091 00000C30 00003528 00002942 00003110 00002572 00007F38 000021C6
00001A85 00005CB0 00007109 000075EB 00007F00 000054FE 00005E57 00000DEB
000058F0 00002641 0000310D 00004A6F 00006AFB 00002E2E 00006272 00001DB6
00001B8D 00003DBA 00000FC1 000041DE 0000765D 000043B1 00000136 00007334


Sorted Array

00000136 00000474 00000742 00000836 00000C30 00000DEB 00000E88 00000FC1
000017F2 0000198C 00001A4C 00001A85 00001B8D 00001DB6 00002102 0000218C
000021C6 00002572 00002641 00002778 0000281A 00002942 00002B78 00002D2E
00002DCA 00002E2E 0000310D 00003110 00003226 00003528 0000364A 000036D9
0000387C 00003DBA 000041DE 0000420C 000043B1 00004795 00004A6F 00005091
000054FE 00005526 000057D3 000058F0 00005940 00005B7C 00005CB0 00005DB4
00005E57 0000611B 00006272 000063BF 000066CF 000066D6 00006AFB 00006B00
00006E0F 00006EDD 00007109 00007334 000075EB 0000765D 00007F00 00007F38


FIGURE 11.8 Output of test driver program.

One of the problems with assembly code is the lack of tools for testing 
and debugging. It is no fun single-stepping through a huge assembly code 
project, peering at hexadecimal values that should be decimal or ASCII. For 
this reason alone, calling assembly code from C makes debugging much 
easier. Listing 11.4 is a short test driver that gives a quick check of both the 
Hexadecimalize and the Quicksort modules.

The routine main() sets up an array of random integers to sort and calls 
dump() to display the results. Getting this much functionality out of 
assembly code would have taken many times longer to write and debug, and 
would have detracted from the real problem at hand: getting Hexadecimalize 
and Quicksort working. Figure 11.8 shows the output produced by the test 
driver and shows that some hexadecimal values were indeed sorted. Although 
this test is not exhaustive, it would have taken much longer to get just this 
far if assembly code were the only tool available. 


Summary 

Assembly code is not a general-purpose tool. It requires self-discipline in 
areas that are normally mechanically enforced in HLLs. Local storage 
management for re-entrance capability, careful thought to addressing modes 
and table structures for position independence, and careful use of commentary 
for documenting the types of things that are documented free in HLLs are all 
things that must be done if assembly code projects are to be successful. But 
the reward is great: that wonderful feeling you get from knowing that
something you wrote is running just as fast as is possible without changing
processors.


A Simple Multitasking Kernel for Real-Time Applications


Nicholas Turner 


This chapter presents several techniques and tricks 
that can significantly improve the performance of 
the crucial innermost heart of a multitasking system. 


For some time, we at Terra Nova Communications have been involved in a
development project that requires a simple, fast, clean 32-bit micro-
processor operating system. After a great deal of research, we were unable to 
find a commercial system that met our stringent requirements of extremely 
fast response time (even under a load of 20 users), low price (less than 
$10,000 for both system and software), compact code size (we wanted a 
system kernel, including all the utility routines discussed in this chapter, that 
required less than 20K of object code) and simple programming of appli- 
cations. After some brainstorming, we created a 68000 multitasking kernel 
that met and even exceeded our expectations of speed and compactness. 
Released from hardware requirements by our decision to write the kernel 
ourselves, we decided to use the VME bus hardware configuration because of 
its standardization, complete hardware specification, relatively low price, ease 
of expansion and the availability of lots of high-speed hardware devices. We 
were also impressed with the reliability and ease of use of the Eurocard con- 
nectors used with the VME bus. 


Why Not Use an Existing OS? 

Our requirements for speed and compactness stemmed primarily from the 
need to handle a large number of I/O tasks over serial lines without incurring 
large overheads for interrupt handling, disk access and context switching—that 
is, we needed to be able to do significant amounts of I/O without slowing the
system. Our kernel should eventually be able to handle 20 real-time users
over serial lines at 1200 bps, including full-speed block transfers, with no 
perceptible response-time delay and only minor slowing of the byte-transfer 
rates. We needed a system that would degrade gracefully; rather than pausing 
in midstream as one task takes over the system for an appreciable fraction of a 
second or as tasks are paged in and out, it should slow down gradually as the 
load increases, always providing steady and uniform output, even if it's at a 
reduced byte-transfer rate. None of the commercial systems we examined had 
this property, nor were they able to handle the load required without sig- 
nificant degradation. 

Examination of the code used in several of the commercial kernels we 
sampled showed some interesting reasons for this: Most kernels contained 
code designed to handle all sorts of unlikely circumstances that might arise in 
an environment in which you don't know what sorts of programs might be 
running. Not only did this code add significantly to the size of the kernel but 
it also slowed down the process of context switching between tasks in many 
cases. We needed the fastest possible context switch in order to guarantee that 
minimum system time was spent on this. Fortunately, we knew exactly 
which applications would be running on our system and we were able to 
design a complete application/system interface to make application coding 
easy. Because we knew that all application code running under our kernel 
would be "polite" (would follow all the rules of the interaction between 
application and kernel) and that all source code would be available for 
debugging, we were able to dispense with a lot of the error-handling code 
usually present in commercial kernels. 

We discovered that another contributing factor to the time required for a 
context switch was the magnitude of the context that was switched. In most 
of the systems we examined, all the machine registers were saved and restored, 
and full status information was saved, both of which took up a significant 
amount of processing time. As we'll explain later, we fixed this in a rather 
unorthodox way. 

Further, many commercial kernels required the use of a memory manager 
chip and spent significant amounts of time paging users in and out to 
compensate for a small system memory. We opted against memory man- 
agement, mainly because it solved no problems for us. We didn't need any 
sort of memory protection; in fact, one of the most important criteria was 
that all tasks must be able to quickly read and write data belonging to any 
other task or to the system itself. Also, our memory requirements were not 
large (minimum 512K, expandable to several megabytes) and because of the 
amount of code sharing between tasks, the structure of our data heap and the 
heap's interaction with the disk system, there was no need for hardware paging 
of memory. 


Why Program the Kernel in Assembly?

It was clear from the start that in order to get the kind of performance we
wanted from the system, the inner kernel had to be written in native code.
Even compiled C or Forth would have had to be manually "tuned" in
assembly source code form. Further, we could see several difficulties with 
compiled language—we needed to do so many "tricky" things to extract the 
last few cycles from the system kernel that writing it in compiled code was 
out of the question. By putting the entire kernel in machine code, we sim- 
plified the effort required to make major changes (it's all in the same lan- 
guage) and made possible a far more complete and integrated tuning. 

Eventually we expect to put up at least a C compiler and a Forth in- 
terpreter for faster development, but for the moment all development is in 
assembly for speed and compactness. Because assembly was the language of 
choice, selection of the target processor was the next important issue. 


Why Use the 68000? 


The eventual goal of our project is to provide a responsive telephone dial- 
up system that can support up to 30 or 40 simultaneous calls without sig- 
nificant performance degradation. Such a task requires a truly powerful 
processor, even if the whole system is written entirely in native code. For 
several reasons, we have chosen the 68000 family of processors for our base 
hardware. The most important reason is that the instruction set is extremely 
versatile and powerful. It is in many ways a true 32-bit instruction set, 
although, unless you are using a 68020, you must put up with slightly 
slower memory access for 32-bit reads and writes because of the 16-bit data 
path. 

The instruction set for the 68000 family is nearly orthogonal—that is, 
almost every instruction can be used with any of the 12 addressing modes. 
This is important for a system on which a lot of assembly-language 
development work is to be done because the programs become much easier to 
generate, read and debug. Unfortunately, even the 68000 is not perfect. 
Several times we've encountered annoying restrictions; for example, we've 
often cursed our inability to do a PC-relative store. 

The 68000 also has another advantage for assembly language pro- 
grammers: The memory architecture in its native mode, without the extras 
added by a memory management chip, is perfectly flat. That is, the address 
space is completely continuous from $00 0000 all the way to $FF FFFF, or 
$FFFF FFFF if you have a 68020. For an assembly hacker, this is far more 
desirable than the segmented architecture required with the Z8000 or 80286 or 
with any 8- or 16-bit processor. For people writing in a high-level language, 
this is not an issue because they never deal directly with the memory at all. 
But for us, it's nice to be able to chop up that big address space in any way 
we like. As I'll explain, we have chosen to make it into an enormous heap 
and virtual disk area, thus making the fullest possible use of each and every 
byte. 

Finally, after it became clear that the VME hardware bus was the best bet 
for our needs, the 68000 processor family was the logical choice because of 
the large number of 68000-oriented products available for the VME 
architecture.

In the paragraphs that follow I'll describe briefly how our operating system
fits together and then I'll get to the fun stuff: the tricks and shortcuts we used 
to get such incredible performance out of the 68010 in our system. If you are 
already familiar with how a multitasking operating system fits together, you 
might want to skim down to the tricks and shortcuts. 


Memory Structure 

RAM memory is divided into two areas: the system zone and the heap, as 
shown in Figure 12.1. The system zone begins at memory address 0 with the 
68000 vector list and continues up from there. It's quite small and contains 
only a few data structures. Figure 12.2 diagrams the structure of the system 
zone. 

The 68000 vector list contains all the hardware vectors required for pro- 
cessing of interrupts and exceptions. It requires $300 bytes. The data in the 
vector list is first set up by the initialization section and modified thereafter 
by the I/O manager as device interrupts are added or deleted. 

Above the vector list is a small zone containing miscellaneous system 
variables and pointers. The time of day and date are kept here, along with 
pointers for the various linked lists maintained by the kernel, plus some other 
information that needs to be quickly accessible via absolute short addressing 
(the fastest way to get at a memory location from the 68000). Several I/O 
devices also have data here, where it can be accessed by interrupt routines 
without the overhead of following pointers through memory. 

Following the system variable zone, we have a rather strange beast in 
today's world: an old-fashioned jump table! This table contains more than 100 
absolute-long address mode JMP instructions at the moment—hundreds more 
are planned. 

The next structure in the system zone is the task control block (TCB) 
table, which is an array of data structures linked via pointers into several 
doubly linked lists. Each task is associated with one TCB in the TCB array. 
When a task releases control to the next task, the context switcher reads the 
pointer from the outgoing task's TCB to get the address of the next task's 
TCB. Because inactive tasks are not part of the active-task linked list, the 
context switcher never spends time reading their TCBs. The linked structure 
also makes possible an extensible TCB array: If the primary array is 
full when another task is about to be spawned, the task manager can allocate a 
nonrelocatable item in the heap and continue the TCB array there. 
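
The ring of active TCBs can be modeled in a few lines of C. This is only a 
sketch: the struct layout and the names ring_init, ring_insert_after and 
ring_length are assumptions made for illustration, not the kernel's own 
definitions. 

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TCB layout: only the two list links are modeled here. */
struct tcb {
    struct tcb *next;   /* next TCB in the active ring */
    struct tcb *prev;   /* previous TCB in the active ring */
};

/* Make a single TCB into a ring of one. */
void ring_init(struct tcb *t)
{
    t->next = t;
    t->prev = t;
}

/* Link 'fresh' into the ring immediately after 'pos'. */
void ring_insert_after(struct tcb *pos, struct tcb *fresh)
{
    assert(pos != NULL && fresh != NULL);
    fresh->prev = pos;
    fresh->next = pos->next;
    pos->next->prev = fresh;
    pos->next = fresh;
}

/* Count the TCBs reachable from 'start' by following next pointers. */
int ring_length(const struct tcb *start)
{
    int n = 1;
    const struct tcb *t;
    for (t = start->next; t != start; t = t->next)
        n++;
    return n;
}
```

If slots 0 and 2 of a four-entry array are linked this way, a walk around the 
ring visits two TCBs and never touches slots 1 and 3, however large the array 
grows. 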

Another extensible array follows the TCB table: The master handle array 
contains handles to relocatable heap items. (A handle is the address of a 
pointer.) All accesses to relocatable heap items must be dereferenced (followed 
through the handle to the pointer to the actual heap item) before a task can 
use the item. That way, if an item is relocated by the heap munger (see the 
description of the heap munger, later) during its background heap op- 
timization, any tasks that own the relocated item will still be able to find it 
because the heap munger always fixes the master handle so that it is correct 
after moving a heap item. This master handle is often located in the master 
handle array, although it doesn't need to be. Wherever it is, though, it must 
be in a non-relocatable place so that the heap munger can follow a pointer 
back from the heap item to its master handle. 
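
The double indirection can be illustrated with a small C model. It is a sketch 
under stated assumptions: the type names, the eight-entry array and the 
functions handle_assign, handle_deref and munger_move are all hypothetical, and 
the back pointer from the item to its master handle is left out. 

```c
#include <assert.h>
#include <string.h>

typedef unsigned char *MasterPtr;  /* points at the item's current home */
typedef MasterPtr *Handle;         /* a handle is the address of that pointer */

enum { MAX_MASTERS = 8 };          /* hypothetical array size */
static MasterPtr master_array[MAX_MASTERS];

/* Claim the first free master slot for 'item' and return its handle. */
Handle handle_assign(unsigned char *item)
{
    int i;
    assert(item != NULL);
    for (i = 0; i < MAX_MASTERS; i++) {
        if (master_array[i] == NULL) {
            master_array[i] = item;
            return &master_array[i];
        }
    }
    return NULL;   /* array full; the real system would extend it */
}

/* Follow the handle to the item's current address. */
unsigned char *handle_deref(Handle h)
{
    return *h;
}

/* Move an item and fix its master pointer, as the heap munger must. */
void munger_move(Handle h, unsigned char *new_home, size_t len)
{
    memmove(new_home, *h, len);
    *h = new_home;
}
```

A task that holds only the handle and dereferences it before each use keeps 
finding the item after a move, because the handle itself never changes. 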

After the last master pointer is an end mark. At the next 32-byte boundary, 
the heap begins. 

The heap (Figure 12.3) takes up all the available RAM beyond the system 
zone. It is a single data structure composed of chunks called items. Every 
single byte of the heap belongs to an item of one sort or another. An item 
can be a deletion or it can contain actual information. Items that contain in- 
formation can be allocated (owned by one or more tasks) or unallocated. 
Unallocated items can be purged by the heap munger (see later) if it needs to 
make more room for a memory request from a task. For speed and ease of 
programming, every single heap item begins and ends on a 32-byte boundary. 

Virtually all system information that doesn't have to be addressed 
absolutely is stored in the heap, in various heap items owned by the system 
kernel. In addition, the heap contains all executable code, including the 
system kernel itself, in chunks called code items. 

Each heap item contains a 32-byte heap header record, followed by zero or 
more 32-byte data blocks. The header record contains the information neces- 
sary for the heap manager and the heap munger to identify, move and validate 
each item. The heap is organized, like much of the rest of the system, as a 
doubly linked list. As the heap munger scans through it, it follows the 
pointers forward or backward to verify that all is in order. If it finds anything 
that is not completely kosher, it immediately stops the system, takes over the 
system console and enters the debugger with a descriptive error message. 
When a bug occurs, it is often the heap munger that detects the problem (in 
the form of a messed-up heap header) before anything else happens. 
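
The sizing rule and the integrity walk can be sketched in C. The header fields 
shown here are assumptions (the actual 32-byte header layout is not given in 
the text), the list is assumed to end in NULL pointers, and the sketch returns 
a status where the real heap munger would stop the system and enter the 
debugger. 

```c
#include <assert.h>
#include <stddef.h>

enum { HEAP_BLOCK = 32 };  /* every item is a whole number of 32-byte blocks */

/* Hypothetical header fields; the real header layout is not given. */
struct heap_item {
    struct heap_item *prev;  /* previous item, NULL for the first */
    struct heap_item *next;  /* next item, NULL for the last */
    unsigned long size;      /* total bytes, header included */
};

/* Bytes an item occupies: one header block plus enough data blocks. */
unsigned long item_bytes(unsigned long data_bytes)
{
    unsigned long blocks = (data_bytes + HEAP_BLOCK - 1) / HEAP_BLOCK;
    return HEAP_BLOCK + blocks * HEAP_BLOCK;
}

/* Walk forward from 'first'; return 1 if every header looks sane, else 0. */
int heap_ok(const struct heap_item *first)
{
    const struct heap_item *it;
    if (first == NULL || first->prev != NULL)
        return 0;
    for (it = first; it != NULL; it = it->next) {
        if (it->size < HEAP_BLOCK || it->size % HEAP_BLOCK != 0)
            return 0;                   /* bad size */
        if (it->next != NULL && it->next->prev != it)
            return 0;                   /* back link does not match */
    }
    return 1;
}
```

Under this rule a request for 33 data bytes occupies 96 bytes: one header block 
and two data blocks. 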


Division of Labor 

The kernel is divided into several distinct code blocks of three types: one- 
time routines (initialization), system calls (routines available from every task) 
and discrete tasks (self-contained programs that run under the context switcher, 
just as do application tasks). Some of the system calls we've developed are 
listed in Table 12.1. The table is not a complete list but does give some of 
the system's flavor. 

The initialization section is the block of code that gains control before 
anything else happens. It starts with a brute-force approach: It grabs control 
from whatever operating system invokes it and then sets up the bare-bones 
data structures for the system. I'll go into greater detail about the initialization 
section later, when I talk about tricks and shortcuts. 

The most important low-level code segment is the context switcher, which 
is the routine that receives control from one task and passes it on to the next. 
It is extremely small and extremely fast, and it makes a lot of assumptions 
about the tasks as it does its job. This is by design: By making assumptions 
and forcing the tasks to adhere to them, a lot of overhead is eliminated. 
Again, I'll go into detail about the context switcher in the section on tricks 
and shortcuts. 


Call Name     Description 

(Task Manager Calls) 
Spawn         Create a new task 
Kill          Destroy a task (by task number) 
Suicide       Kill the calling task 

(Heap Manager Calls) 
HeapGimme     Allocate an item in the heap 
HeapDel       Release (delete) a heap item 
FillZero      Re-initialize a heap item 
GetMaster     Assign a master handle for an item 

(Message Manager Calls) 
SendMsg       Send a copy of a block of memory to another task 
Del1Msg       Delete message from top of incoming queue 
DelMsgs       Delete entire incoming message queue 
TxtMsg        Send a message of type "TEXT" 
GetMsg        Fetch next message in queue 
HandleMsg     Analyze incoming message and handle if standard type 

(Character I/O Manager Calls) 
DevReq        Request a character I/O channel 
DevDemand     Demand a character I/O channel (usually impolite) 
DevRel        Release a character I/O channel 
PrToStd       Select this task's standard character I/O device 
PrToMem       Select the MemPrt device (see text) 

(Text Manager Calls) 
GetCommand    Input a command line and parse it, passing control to the 
              appropriate routine based on the command 
AddCmdTab     Add a set of commands to the existing command set 
DoCommand     Parse and execute a command already stored in memory 
GetPSW        Input and encrypt a password (to be compared with an 
              encrypted password from the user file) 
MoveString    Move an ASCII string 
GetLine       Input a line of ASCII text from the character I/O device 
PrLine        Print an ASCII string on the character I/O device 
Print         Print a line of text. The line is expected to immediately 
              follow the JSR Print instruction 
CompString    Compare two ASCII strings 

(Miscellaneous System Calls) 
Random        What system would be complete without random numbers? 
Sqrt          Square root of 32-bit integer 


TABLE 12.1 Some of Terra Nova's system calls. 

The task manager is composed of a group of system routines available to 
every task. They allow a task to create, destroy and manipulate other tasks or 
itself. 

The heap manager is a collection of routines that allow any task to request, 
release, lock, enlarge or otherwise manipulate heap items. 

The message manager is a collection of system routines that make pos- 
sible a clean and well-defined message-passing protocol between tasks. 
Simply by pointing at a block of memory and calling a system routine, any 
task can send a copy of any piece of data to any other task. Reading queued 
messages from other tasks is similarly easy. 

The character I/O manager is a set of system calls and interrupt service 
routines. Together with the text manager, it makes possible a simple I/O 
structure in which each task can select any physical or logical device for I/O 
by passing a device number. Serial input is interrupt-driven and serial output 
is polled. Device drivers can be added or removed from the I/O manager with 
another set of system calls. 

The text manager is a collection of system routines that ease the processes 
involved in talking to humans. It includes powerful routines to get and parse 
command strings, as well as text-manipulation routines such as case con- 
version, context-sensitive string comparisons and so forth. The powerful 
parsing calls make it easy to create a tiny machine language task that includes 
a complete command interpreter and syntax error handler. This is very im- 
portant if a significant amount of development is to be done in assembly 
language. 

The trap manager handles all system traps except I/O interrupts, which go 
directly to the I/O manager. Error traps always cause the system to come to a 
complete standstill. This is important to us because of the close interaction 
between the tasks, which must always be in intimate communication to 
fulfill the purpose of the system. When an error trap occurs, all task 
switching and interrupt processing stop and the task in which the error 
occurred takes over the system console and enters the debugger. The human in 
charge can then take corrective action and restart the system with minimal 
damage. Obviously this approach would be completely unacceptable in a 
commercial operating system, but for us it is ideal because we have all the 
source code for the entire system and can often correct bugs as soon as they 
occur. 

No assembly language development system is complete without a de- 
bugger, of course, especially a multitasking one. Our debugger is a command 
parser that any task can invoke, either automatically (in response to an error 
trap) or directly. It is capable of running alongside other active tasks, even 
multiple copies of itself, and it allows the user full manipulation and 
examination of memory. 

The heap munger is a distinct task, always present, always active, whose 
original job was to survey the contents of the heap continuously and maintain 
it as an efficient data structure (by using a background task for this, we 
avoided many complexities). The heap munger has turned into quite a bit 
more than just a trash compactor, however. Its responsibilities are many and 
varied, from checking the TCB array for integrity and waking up sleeping 
tasks when their ships come in, to responding to messages from other tasks 
that want to know what the system loading is so that they can adjust their 
own CPU usage to increase the overall performance of the system. In fact, the 
heap munger is also capable of responding to text messages sent to it by a 
human who is operating a user task, in which case it responds by sending a 
plain English message back to the source task, which then displays it for the 
human to read. As a general rule, the heap munger performs any systemwide 
activity that must be performed at frequent, regular intervals. With all these 
responsibilities, it is the largest single code segment in our kernel, weighing 
in at about $A00 bytes. 

The disk munger, like the heap munger, is a distinct task that is always 
running except when it's waiting for an I/O completion. Because its structure 
and function are application specific, I won't go into it in great detail here. I 
would like to point out, however, that by allocating a single task to handle 
each disk device, a large number of problems related to data contention be- 
tween tasks can be avoided. The disk munger is entirely message-driven: As 
each task requires a disk access, it sends a message to the disk munger for the 
device it wants to access. Each I/O request gets added to a queue of such 
messages. When the disk I/O is completed, the disk munger sets the "wake- 
up" flag for the requesting task and the heap munger wakes it up on its next 
pass. The requesting task then looks in its TCB for the completion code from 
the disk munger. Because the I/O requests are high-level calls, usually im- 
plicitly including the open, read/write and close of the file, there are few file 
contention problems. 

On start-up, a discrete task called the system console manager initially 
gains control of the system console and puts up a system monitor prompt. 
Until another user task is spawned, it is the only task that has access to a 
character I/O device (and thus, to a human). The system console manager can 
then give way to any other user program. 


Tasks 

Everything that gets done on the system, with the single exception of the 
context switch between tasks, is done by a task. Three tasks are always 
present in the system: the system console task, which initially runs the 
system console manager; the heap munger; and at least one disk munger. The 
disk munger is actually optional, although it's hard for me to imagine doing 
any useful work without using a disk. 

Each task, whether it's active or not, possesses exactly one task data item 
(TData item) in the heap. The TData item is crucially important to the proper 
functioning of the system because it's where each task stores its most 
important local information. Every task's TCB entry contains the master 
pointer to its TData area in the heap. During normal execution of a task, the 
68000 register a5 always points to the base of the TData item. Before the 
context switcher passes control to a task, it always sets up register a5 for the 
incoming task. 

The TData item is also the location of the task's local data stack: The stack 
goes from the top of the TData item down, and the task's local data goes up 
from the bottom. Note that it is the responsibility of each task to ensure that 
its stack and TData areas do not collide. 
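
A task could guard against such a collision with a check along the following 
lines. This is a hypothetical helper, not part of the system described here: 
the struct and the names tdata_gap and tdata_can_push are invented for the 
sketch, and all positions are offsets from the TData base (the value in a5). 

```c
#include <assert.h>

/* Hypothetical description of one TData item; offsets are from its base. */
struct tdata_view {
    unsigned long size;       /* total bytes in the TData item */
    unsigned long data_top;   /* first unused offset above the local data */
    unsigned long sp_offset;  /* current stack pointer as an offset */
};

/* Bytes still free between the local data and the stack; 0 means they
   touch, and a negative result means they already overlap. */
long tdata_gap(const struct tdata_view *t)
{
    assert(t->sp_offset <= t->size);
    return (long)t->sp_offset - (long)t->data_top;
}

/* Return 1 if pushing 'bytes' more onto the stack would still fit. */
int tdata_can_push(const struct tdata_view *t, unsigned long bytes)
{
    return tdata_gap(t) >= (long)bytes;
}
```

With a 4096-byte TData item, 256 bytes of local data and an empty stack, the 
gap is 3840 bytes. 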

Each task's TData item contains spaces for various pointers and vectors 
associated with its current character I/O device, if any. The vectors include 
various standardized routines such as InCheck, which checks to see if a char- 
acter is available; InWait, which waits for a character and returns with it; 
OutCheck; OutWait; plus several others. The pointers include the addresses of 
the destination for the next input byte (if any), the source of the next output 
byte and so forth. 

Character input is generally interrupt-driven. Because a task that is await- 
ing input is often "asleep" (not in the active TCB list), it is necessary for the 
interrupt routine to set a flag that tells the system to wake up the owner of 
the device. Then, after the interrupt has been handled, the heap munger, which 
is always awake and active, catches the set flag the next time it gets control 
and performs the actual manipulations to return the task's TCB to the active 
list. In order to provide an input time-out, the heap munger also awakens each 
task once every ten seconds so that the task can test for the time-out. 
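
The division of labor between the interrupt routine and the heap munger might 
look roughly like this in C. The field and function names are assumptions, the 
TCB array is treated as a plain C array, and the ten-second time-out wake-up is 
not modeled. 

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TCB fields for this sketch. */
struct tcb {
    struct tcb *next, *prev;  /* links in the active ring */
    int asleep;               /* 1 while the task is off the active ring */
    volatile int wake;        /* set by an interrupt routine */
};

/* Interrupt side: do as little as possible, just raise the flag. */
void input_interrupt(struct tcb *owner)
{
    owner->wake = 1;
}

/* Heap munger side: relink every flagged sleeper after 'self' (the
   munger's own TCB, always on the ring). Returns the number woken. */
int munger_wake_pass(struct tcb *table, int count, struct tcb *self)
{
    int i, woken = 0;
    assert(self != NULL && !self->asleep);
    for (i = 0; i < count; i++) {
        struct tcb *t = &table[i];
        if (t->asleep && t->wake) {
            t->prev = self;
            t->next = self->next;
            self->next->prev = t;
            self->next = t;
            t->asleep = 0;
            t->wake = 0;
            woken++;
        }
    }
    return woken;
}
```

The interrupt routine never touches the list pointers. All relinking happens 
in the munger's own time slice, so the list cannot change underneath a task 
that is in the middle of a context switch. 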

Each task may request to be assigned to a character I/O device. If a device 
that is requested is currently assigned to another task, the requestor will 
usually have to wait until the device is available. For special cases such as 
error traps, however, there is a system call that allows the task to demand to 
be connected to a device. When a demanded device is released, it reverts to the 
task (if any) to which it was attached originally. When a requested device is 
released, it always becomes available (unattached) again. Some devices can be 
attached to multiple tasks at the same time—for example, there is a device 
called MemPrt that reads and writes to memory as if it were a character stream 
from/to a serial device. This "virtual" device can be simultaneously active for 
different tasks, each with its own set of pointers in the TData item for reading 
and writing. 
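
The request, demand and release rules for an exclusive device can be captured 
in a short C model. The struct and the functions dev_request, dev_demand and 
dev_release are hypothetical stand-ins for DevReq, DevDemand and DevRel. Tasks 
are plain integers, a busy request simply fails instead of waiting, and nested 
demands are not handled. 

```c
#include <assert.h>

enum { NO_TASK = -1 };

/* Hypothetical bookkeeping for one exclusive character I/O device. */
struct device {
    int owner;      /* task currently attached, or NO_TASK */
    int displaced;  /* task pushed aside by a demand, or NO_TASK */
    int demanded;   /* 1 while the current attachment came from a demand */
};

void dev_init(struct device *d)
{
    d->owner = NO_TASK;
    d->displaced = NO_TASK;
    d->demanded = 0;
}

/* Polite request: succeeds only if the device is unattached. */
int dev_request(struct device *d, int task)
{
    if (d->owner != NO_TASK)
        return 0;
    d->owner = task;
    d->demanded = 0;
    return 1;
}

/* Demand: always succeeds and remembers who was attached before. */
void dev_demand(struct device *d, int task)
{
    assert(!d->demanded);   /* nested demands are not modeled */
    d->displaced = d->owner;
    d->owner = task;
    d->demanded = 1;
}

/* Release: a demanded device reverts; a requested one becomes free. */
void dev_release(struct device *d)
{
    if (d->demanded) {
        d->owner = d->displaced;
        d->displaced = NO_TASK;
        d->demanded = 0;
    } else {
        d->owner = NO_TASK;
    }
}
```

If task 1 holds the console and task 2 demands it, releasing the demand hands 
the console back to task 1. A second release, this time by task 1, leaves it 
unattached. 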

I won't go into great detail about our I/O drivers or the low-level structure 
of the I/O routines; this sort of information is readily available in several 
forms in computer bookstores. 


The Good Stuff: Tricks and Shortcuts 


The first thing our operating system does when it gains control of the 
processor is to deviously remove the existing operating system. This it does 
by using a trick to get into supervisor mode (see Listing 12.1). First, it 
shoves a new address into the privilege exception vector, which is in the 
hardware vector table in low memory. This address happens to be that of the 
second instruction following the one that does the store to the vector. The 
next instruction is a privileged but otherwise harmless one. Thus, when the 
processor tries to execute it in user mode, it traps to the privilege exception 
vector and thence to the next instruction in the program. If through some quirk 
we happen to be in privileged mode already, the processor harmlessly executes 
the privileged instruction and falls through. Either way we are now in 
privileged mode, and we can quickly grab the rest of the system. 

Next, we turn off all the interrupts in the system as quickly as possible. It 
is important to do this before clearing memory because a stray interrupt 
might happen before we can turn it off and it must still vector properly. After 
all interrupts are out of commission, we copy our own set of vectors into the 
interrupt table. 

The jump table (discussed later) is next moved into place in low memory. 
It is read from a section of object code within another assembled module. It's 
nothing more than a sequence of more than a hundred absolute-long JMP 
instructions. 

Next, we clear the rest of memory—except the kernel, of course—to 
zeroes. This is both a general safety measure and an aid in debugging: If a 
chunk of memory is nonzero, we can be certain that something we did caused 
it to be that way. Also, it's nice to be able to assume that unused memory is 
always zeroed out; it makes for much faster initializations later on, once the 
system is running. 

The system zone requires a certain amount of initialization. The task 
control blocks must be set up and the end marks for the TCB and master 
handle arrays must be set in place. The values in the miscellaneous system 
data area must also be initialized. 

After the system zone is in place, the heap is defined from the next 32-byte 
boundary to the end of memory. Three heap items are initially set aside: a 
deletion from the beginning of the heap to the start of the kernel's code, a 
fixed (immovable) code item for the kernel, and another deletion from the end 
of the kernel to the end of memory. Soon, the other tasks will be carving up 
the big deletions for their own use. 

Each time a task is spawned, the task manager creates a new TCB and a 
new TData item. The first task to be spawned is the system console manager. 
The system task manager allocates an item of the proper size from the heap, 
creates a TCB for the new task, and adds the TCB to the current linked list of 
active tasks' TCBs. 

After the system console task is spawned, the heap munger and disk 
munger are also spawned. Note that none of them begins to execute until the 
initialization code jumps into the middle of the context switcher. 

We discovered that one of the biggest sources of "extra" system overhead 
in commercial operating systems is the need to manage tasks that might 
possibly get out of hand and take over the system. In most cases, such 
"runaway" tasks are avoided by using a hardware timer to assure that any 
given task will only be able to run for a preset time. If a task spends too 
much time without releasing to the system, the timer interrupt occurs and 
vectors the CPU through to the supervisor program. This program then saves 
the existing task's registers and status and restores those of the next task in 
line, which then is off and running with a newly reset timer. All of this 
requires a lot of tricky and complex code to ensure that no task can "run 
away" and lock up the system. Also, the constant saving and restoring of 
registers and status, which is necessary because the context switch is 
interrupt-driven, adds measurably to the overhead required for a context switch. 

We debated for some time about the best way to reduce this overhead. 
Finally, we settled on the answer: We developed our kernel as a non- 
preemptive task controller. This means that there is no hardware timer to 
interrupt each task after some preset interval, and no routines are required to 
service such an interrupt. Instead, we use a simple context switcher, one that 
doesn't even bother to save registers or the previous task's status bits. It is the 
task's responsibility to call the context switcher often enough to ensure 
smooth system operation and to make sure that it saves any registers that it 
needs (other than a5 and the stack pointer). For our application, this is perfect 
because most of our tasks spend most of their time waiting for input or 
output in the inactive task list, during which time the other tasks can run 
unhindered. 

When a task has finished with the CPU and is ready to let the next task in 
the active list run for a while, it simply calls the context switcher as a 
subroutine using a JSR instruction to a low-memory JMP instruction whose 
address is fixed regardless of the location of the context switcher. Even with 
the added overhead of the extra JMP instruction, this method of calling is 
considerably faster than a TRAP instruction—the usual method of calling a 
context switcher. 
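
The idea of a fixed entry point that forwards to a movable routine has a close 
C analogue in a table of function pointers. The sketch below is only an 
analogy: the slot names and the install and sys_call functions are invented 
here, and an array index stands in for what is a fixed absolute address on the 
68000. 

```c
#include <assert.h>
#include <stddef.h>

typedef long (*sys_routine)(long);

/* Hypothetical slot numbers; a slot plays the role of a fixed
   low-memory address that never changes between kernel builds. */
enum { SLOT_SQUARE, SLOT_NEGATE, SLOT_COUNT };

static sys_routine jump_table[SLOT_COUNT];

/* Initialization: point a slot at wherever the routine currently lives. */
void install(int slot, sys_routine target)
{
    assert(slot >= 0 && slot < SLOT_COUNT);
    jump_table[slot] = target;
}

/* A task calls through the slot and never needs the routine's address. */
long sys_call(int slot, long arg)
{
    assert(slot >= 0 && slot < SLOT_COUNT && jump_table[slot] != NULL);
    return jump_table[slot](arg);
}

/* Two stand-in routines, used only to exercise the table. */
long square(long x) { return x * x; }
long negate(long x) { return -x; }
```

Callers are compiled against the slot, so the routine behind it can be moved or 
replaced without reassembling any task. 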

Once the switcher has control, it checks to see if the system tasking is 
stopped (see the listing). If it is, then the calling task immediately gets back 
control through a simple RTS. Otherwise, it gets ready to call the next task. 

The address of a task's TCB is in its TData area. This address is loaded into 
a0. Then the current TData base address (in register a5) is subtracted from the 
stack pointer, yielding a relative displacement, which is stored in the TCB. 
Now we're ready to move on to the next task. 

The address of the next task in the circular linked list of active tasks is 
fetched from the old task's TCB into a2. Register a5 is set to point to the new 
task's TData area by moving its address from the TCB, and the stack pointer 
is restored from the relative displacement by adding a5 to it. Then a simple 
RTS returns control to the task. 
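
The save and restore arithmetic can be traced with a small C model in which 
addresses are plain 32-bit integers. The struct fields and the con_switch 
function are assumptions for illustration. In particular, the running task's 
TCB is found through a 'current' pointer here, where the listing reads it from 
the TData item. 

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical TCB fields; addresses are modeled as 32-bit integers. */
struct tcb {
    struct tcb *next;   /* next TCB in the active ring */
    uint32_t tdata;     /* base address of the task's TData item */
    uint32_t sp_rel;    /* saved stack pointer minus the TData base */
};

/* The two registers the switcher cares about, plus the running task. */
struct cpu {
    uint32_t a5;        /* TData base of the running task */
    uint32_t sp;        /* absolute stack pointer */
    struct tcb *current;
};

/* Save the outgoing stack pointer as an offset, then load the next task. */
void con_switch(struct cpu *c)
{
    struct tcb *old = c->current;
    struct tcb *nxt = old->next;

    old->sp_rel = c->sp - c->a5;       /* SUBA.L A5,SP, then store */
    c->a5 = nxt->tdata;                /* new TData base */
    c->sp = nxt->tdata + nxt->sp_rel;  /* ADDA.L A5,SP */
    c->current = nxt;
}
```

One consequence of storing an offset rather than an absolute address, though 
the text does not say whether this was the motive, is that the saved value 
remains correct even if the TData base recorded in the TCB changes while the 
task is waiting. 
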

This otherwise trivial scheme has one slight complication: Frequently, a task 
will release control with the intention of going to sleep for a while. This 
happens, for example, when the RAM buffer has no input characters for the 
character device attached to a task that is waiting for input. When the input 
finally happens, the interrupt routine sets a flag that causes the heap munger 
to wake up the task when it next checks for such a situation. The problem is 
that the context switcher I have just described has no provision for putting the 
old task to sleep: It assumes that both tasks want to stay awake. 

So, we created an alternative context switcher that removes the outgoing 
task from the active list before calling the new one (see the listing). When a 
task wishes to go to sleep, it simply calls the alternate context switcher. The 
primary difference is that before it moves on to the next task, the alternate 
switcher does some standard and fast list manipulation. 
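
That list manipulation amounts to closing the ring around the outgoing TCB. A 
minimal C sketch, with hypothetical field and function names, might read: 

```c
#include <assert.h>
#include <stddef.h>

enum tcb_state { TCB_ACTIVE, TCB_ASLEEP };

/* Hypothetical TCB fields for this sketch. */
struct tcb {
    struct tcb *next, *prev;
    enum tcb_state state;
};

/* Unlink 'self' from the active ring, mark it asleep, and return the
   TCB that should run next. The sleeper keeps its own links, but
   nothing on the ring points at it any more. */
struct tcb *con_sw_sleep(struct tcb *self)
{
    struct tcb *next = self->next;
    struct tcb *prev = self->prev;

    assert(next != self);   /* at least one other task must stay awake */
    prev->next = next;
    next->prev = prev;
    self->state = TCB_ASLEEP;
    return next;
}
```

Because the heap munger is always on the active list, the ring never becomes 
empty, which is the condition the assertion stands in for. 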

Most commercial operating systems use one or more of the TRAP 
instructions to perform system calls, usually going on the premise that they 
are there for that purpose and that the TRAP instructions allow programs to 
be more general and more easily relocated. Unfortunately, TRAP instructions 
on the 68000 cause a major problem: They take a long time to execute, as do 
the instructions to decode their arguments. The RTE instruction, which 
returns from a TRAP, shares the same problem. 

We could understand the importance of relocatable programs, certainly, but 
we felt the time-consuming TRAP instruction was too much to ask of an 
ultra-fast, real-time system. We therefore designed a faster way to call system 
routines: the JSR instruction and an old-fashioned jump table in low memory 
(see Table 12.2 for a timing comparison). Instead of using a TRAP 
instruction with an argument following in the next word, we simply call one 
of many entry points that are at absolute locations in low memory. Each 
entry point consists of a single JMP instruction to the actual system call 
entry point. Not only do we save the extra time required for the TRAP and 
RTE instructions but we also avoid having to extract the argument word from 
the bytes following the TRAP instruction and having to add two to the return 
address to jump around the argument because the argument is implicit in our 
choice of which routine to call. (With TRAP-based system calls, an argument 
is required to specify which system call to use because there are only 16 traps. 
With JSR calls, there can be hundreds of separate entries, so no argument is 
required to specify which call is to be used.) By using subroutines instead of 
traps, we cut the overhead of every single system call from more than 100 
machine cycles to 46, which makes a measurable difference in a machine that 
uses lots of system calls. 


TRAP-Oriented Calls 

Instruction          Cycles Used     Description 
TRAP #n              38              Call the system routine 
MOVE.L 2(SP),A0      16              Point to word argument 
MOVE.W (A0)+,D0      8               Fetch the argument 
MOVE.L A0,2(SP)      16              Update return address 
(variable)           (variable)      Decode argument word 
-----                                Useful code 
RTE                  24              Return to caller 
                     102 (+ decode)  Total cycles for overhead 


JSR-Oriented Calls 

Instruction          Cycles Used     Description 
JSR Label.W          18              Call low memory entry point 
JMP Label.L          12              Call actual routine 
-----                                Useful code 
RTS                  16              Return to caller 
                     46              Total cycles for overhead 


TABLE 12.2 Comparison of TRAP- and JSR-oriented system calls. 
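
The totals can be checked by adding the per-instruction figures from Table 
12.2. The short C sketch below does just that, with array and function names 
chosen for illustration. The listed TRAP-path figures add up to 102 cycles 
before any argument decoding, and the JSR-path figures to 46. 

```c
#include <assert.h>
#include <stddef.h>

/* Fixed overhead cycles from Table 12.2 (argument decoding excluded). */
static const int trap_path[] = { 38, 16, 8, 16, 24 };  /* TRAP, 3 MOVEs, RTE */
static const int jsr_path[]  = { 18, 12, 16 };         /* JSR, JMP, RTS */

/* Add up 'count' cycle figures. */
int total_cycles(const int *cycles, size_t count)
{
    int sum = 0;
    size_t i;
    assert(cycles != NULL);
    for (i = 0; i < count; i++)
        sum += cycles[i];
    return sum;
}

/* Cycles saved per call by the JSR path, before any decode time. */
int cycles_saved(void)
{
    return total_cycles(trap_path, sizeof trap_path / sizeof trap_path[0])
         - total_cycles(jsr_path, sizeof jsr_path / sizeof jsr_path[0]);
}
```

The fixed saving is therefore 56 cycles per call, and the real saving is larger 
by however long a TRAP handler would spend decoding its argument word. 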


Conclusion 

With the two context switchers just described, a small set of carefully 
designed system routines, a somewhat unusual system calling procedure, and 
a certain amount of cooperation from the application programs, we have 
vastly increased the throughput of our system. Our approach is obviously not 
well suited to most projects as it requires a considerable amount of skill and 
cooperation on the part of the programmers. Furthermore, because of its 
nonstandard nature, it is poorly suited to any applications that are written for 
commercial systems—at least until we get a C compiler running! If you need 
an extremely fast multitasking system for a specialized real-time application 
and are strapped for funds, however, this approach can turn a relatively 
inexpensive microcomputer into an amazingly powerful system. To date we 
have done just that for four different hardware configurations. 
Listing 12.1 


; Terra Nova Communications multi-tasking kernel 
; Initialization and task-switcher 
; 
; Note: this is not intended to be a complete listing. It's only 
; a sample of some of the techniques used in our system. 


            PSECT   Kernel 


; External symbols (defined in other code segments) 

            EXTERN  VecTable,       ;vector table for hardware vector list 
                    JMPTable,       ;jump table for system calls 
                    JMPTabLen,      ;length of jump table in bytes 
                    KernEnd,        ;end of kernel code item in heap 
                    IOInit,         ;our private I/O initialization routine 
                    HeapInit,       ;our private heap initialization 
                    SysInit,        ;system variable initializer 
                    SysConMon,      ;entry point for system console 
                                    ;monitor task 
                    HeapMunger,     ;entry point for heap munger task 
                    DiskMunger      ;entry point for disk munger task 


; Entry points in this module (referenced from elsewhere) 

            ENTRY   Start,          ;primary entry point to boot our OS 
                    ConSwitch,      ;main context switcher 
                    ConSwSleep      ;alternate context switcher (puts 
                                    ;calling task to sleep) 


; Include files (mostly equates) 

            INCLUDE SysEqu          ;contains the low-memory absolute 
                                    ;address equates (jump table, etc) 
            INCLUDE HeapDef         ;defines the heap data structure 
            INCLUDE SysIO           ;contains hardware I/O equates 

; Miscellaneous storage 

CodeHeap    DS.L    8               ;heap header for kernel heap item 
StackEnd    DS.L    40              ;system stack before tasking starts 
StackBegin  DS.L    0               ;top of startup stack area 


; Pre-tasking initialization 
; this code works in single-task mode 
; prior to the invocation of the context switcher 

Start                               ;Initial entry. Calling operating system is still 
                                    ;alive and kicking at this point. 
TakeOver    LEA     ReEntry,A1      ;point to re-entry instruction 
            MOVE.L  A1,$20.W        ;move short absolute to the vector 
                                    ;for privilege exceptions 
            MOVE    USP,A0          ;try a privileged instruction. If it 
                                    ;works, then we're in priv. mode. If not, then trap to 
                                    ;ReEntry and be in privileged mode anyway 
ReEntry     LEA     StackBegin,A7   ;set up initial stack 


; Turn off all interrupts in the system 
; Note: this is device-specific code. 
; The labels in the operand fields are from our own 
; SysIO include file. 

            CLR.B   FDCIntMask      ;clear floppy disk & system console 
            CLR.B   HDIntMask       ;clear hard disk completion int. mask 
            CLR.B   SerIO1IntMask   ;clear serial boards 
            CLR.B   SerIO2IntMask 


; Initialize the vector table 
; Copy the vectors from an assembled table (in another module) 
; into the actual hardware vector list in low RAM 

            LEA     VecTable,A0     ;source (in another code segment) 
            LEA     $0.W,A1         ;destination (begins at $00 0000) 
            MOVE    #191,D7         ;192 longwords to move 
VecMove     MOVE.L  (A0)+,(A1)+     ;move a longword 
            DBRA    D7,VecMove      ;repeat till done (fast loop on 68010) 


; Copy system routine JMP table from assembled object code (in 
; another module) to low memory jump table, where everyone 
; can get at them. 

            LEA     JMPTable,A0     ;source 
            LEA     System.W,A1     ;dest. (name of first system call in 
                                    ;the jump table. "System" is from the SysEqu include 
                                    ;file. It's the context switcher) 
            MOVE    #JMPTabLen/4,D7 ;number of longwords to move 
JPTMove     MOVE.L  (A0)+,(A1)+     ;move a longword 
            DBRA    D7,JPTMove      ;repeat till done (fast loop on 68010) 


; Clear low memory to zero (between jmp table and kernel) 
; Note: DBRA counts in the low word of D7 only and runs D7+1 passes. 

            LEA     StackEnd,A1     ;point to top of destination 
                                    ;and bottom of destination (end of the jump table) 
            LEA     System+JMPTabLen.W,A0 
            SUBA.L  A0,A1           ;calculate the length 
            MOVE.L  A1,D7           ;move to D7 for counting 
            LSR.L   #4,D7           ;divide by 16 for 16-byte blocks 
LowClr      CLR.L   (A0)+           ;clear 16 bytes, quickly 
            CLR.L   (A0)+ 
            CLR.L   (A0)+ 
            CLR.L   (A0)+ 
            DBRA    D7,LowClr       ;do it until done. 


; Clear high memory to zero (between kernel and end of RAM) 
; (RAMEnd is first byte beyond RAM, defined in SysEqu) 

            LEA     RAMEnd,A1       ;point to top of destination 
                                    ;and bottom of destination (end of the kernel) 
            LEA     KernEnd,A0 
            SUBA.L  A0,A1           ;calc the length 
            MOVE.L  A1,D7           ;move to D7 for counting 
            LSR.L   #4,D7           ;divide by 16 for 16-byte blocks 
HiClr       CLR.L   (A0)+           ;clear 16 bytes, quickly 
            CLR.L   (A0)+ 
            CLR.L   (A0)+ 
            CLR.L   (A0)+ 
            DBRA    D7,HiClr        ;do it until done. 


; Initialize all of the primary I/O devices 
; Note: this is a device specific routine not treated in the article. 

            JSR     IOInit 


; Initialize the heap 
; Note: this is a routine in the heap manager, which creates 
; valid heap headers for the three initial heap items discussed 
; in the text: the deletion below the kernel, the kernel code 
; item, and the deletion above the kernel. 

            JSR     HeapInit 


; Initialize the system zone of low memory 
; Note: this sets up the TCB and master handle arrays, as 
; discussed in the text, as well as initializing the time of day and 
; the date and the other miscellaneous system values. 

            JSR     SysInit 
5 Spawn off the initial tasks 
: This will create TCBs and TData items for the tasks, but won't 
: invoke them. They're invoked only by the context switcher. 
LEA SysConMon, AO 7point to system console entry point 
MOVE.L #4096,D0 7tell it how much RAM for TData 
JSR Spawn #jump through jump table entry 
#("Spawn" is a jump table equate in SysEqu) 
LEA HeapMunger, AO 7spawn the heap munger 
MOVE.L #512,D0 sheap munger's TData size 
JSR Spawn 
LEA DiskMunger, AO 7Spawn the disk munger 
MOVE.L #8192,D0 7(TData includes one disk buffer) 
JSR Spawn 
LEA TCB1.W,A2 #get address of first TCB in array 
+ (TCB1 is defined in SysEqu) 
BRA.S ConSwl ;now start the context switcher! 
; Context Switcher: primary version
; Simple task-switch, nothing fancy.
; SysFlags is a low-RAM system flag byte, defined in SysEqu.
; The data structure for the TData item is defined in SysEqu.
; The data structure for the TCB is defined in SysEqu.
ConSwitch  BTST    #StopSys,SysFlags.W  ;task switching inhibited?
           BNE.S   ConSwX               ;yes, exit back to caller
           MOVE.L  OurTCB(A5),A0        ;get TCB address from TData
           SUBA.L  A5,SP                ;subtract TData base addr from stack
           MOVE.L  SP,TCBSP(A0)         ;save relative displacement in TCB
           MOVE.L  TCBNxt(A0),A2        ;get address of next TCB
ConSwl     MOVE.L  TCBA5(A2),A5         ;get new TData base address
           MOVE.L  TCBSP(A2),SP         ;get stack relative displacement
           ADDA.L  A5,SP                ;restore absolute address
ConSwX     RTS                          ;return to next task

; Context Switcher: alternate version
; Put the calling task to sleep.
ConSwSleep BTST    #StopSys,SysFlags.W  ;task switching inhibited?
           BNE.S   ConSwX               ;yes, exit back to caller
                                        ;without going to sleep
           MOVE.L  OurTCB(A5),A0        ;get TCB address from TData
           SUBA.L  A5,SP                ;subtract TData base addr from stack
           MOVE.L  SP,TCBSP(A0)         ;save relative displacement in TCB
           MOVE.L  TCBNxt(A0),A2        ;get address of next TCB
           MOVE.L  TCBPrev(A0),A1       ;get addr of previous TCB
           MOVE.L  A2,TCBNxt(A1)        ;close the pointers around the now-
           MOVE.L  A1,TCBPrev(A2)       ;sleeping task.
           MOVE.B  #Sleep,TCBState(A0)  ;mark it as asleep
           MOVE.L  TCBA5(A2),A5         ;get new TData base address
           MOVE.L  TCBSP(A2),SP         ;get stack relative displacement
           ADDA.L  A5,SP                ;restore absolute address
           RTS                          ;return to next task


A Commercial
Multitasking Kernel 


Steve Passe 


Here's a multitasking kernel for the 68000 that 
could serve as a "seed" for a complete operating system. 


Sixteen bits and nowhere to go! Being a "hardware" type, I naturally had to
have a 16-bitter as soon as I could manage it. After letting the dust settle a
bit, I decided to go with the 68000 processor from Motorola. The board I
chose has Motorola's MACSbug monitor in PROM, but other than that I had
no software to run my new friend.

Several months later, after punching out a cross-assembler in C for cross
software development, I was ready to start writing code. But what? I have
always been fascinated by real-time software, and multiple tasking is
supposed to be one of the things that the 68000 is designed for. So, enter the
kernel.


The Kernel 


A kernel is a software device that distributes CPU time among concurrently
running processes. A process is an individual task, or program, that
runs asynchronously with other tasks. Since the kernel is continuously
switching quickly between tasks, it appears to a human observer that all
tasks are running simultaneously. Responding to both hardware interrupts and
software traps, the kernel decides which process gets the next use of the CPU,
how long it may use it, and in what order other processes must wait.

System resources other than the CPU may also be shared by creating 
device manager processes for each port, disk drive, etc. The process controls 
the device, carrying out requests for device usage from other processes. It also 
may return either data or status information to the requesting process. 

As an example, a device manager might control each terminal port on a
system. A user program wishing to do I/O would generate a software trap to
the kernel, passing a request for either input or output. The data would
typically be placed into a common buffer before an output request, or fetched
from the buffer after an input request. Upon receiving such a request from a
process, the kernel would stop that process, saving all necessary registers
in RAM. It would then start the device managing task, passing the request to
it. This task would initiate the I/O and give control back to the kernel. The
kernel would then run some arbitrary task until receiving an interrupt from
the hardware signalling that the I/O is complete. It then awakens the manager,
which passes control back to the requesting process, along with a result
message describing the outcome of the I/O.

Although this seems quite complicated on the surface, it allows more 
efficient use of the CPU, since it is not wasting time while waiting on the 
relatively slow speed of port hardware. It also creates a level of system 
protection because the device manager may refuse to grant requests based on 
permission algorithms, reservations, etc. 


The 68000 


The Motorola 68000 offers all the hardware features necessary for efficient
implementation of such a kernel. It allows a total of 192 user-generated
interrupts, either software or hardware generated. Each such interrupt has a
corresponding 4-byte vector in the first page of memory that holds a jump
address to a service routine. It also has two operating modes: supervisor and
user. When in user mode certain instructions may not be used, allowing direct
hardware protection of sensitive instructions. Furthermore, the hardware can
address memory so that no process running in user mode can even access
memory set aside for the supervisor state/code.

The assembler syntax used in the listing is fairly straightforward,
consisting of mnemonics and pseudo-ops defined by Motorola. Comments are
preceded by an asterisk (*) on comment-only lines, or may follow the
operand fields of statements without the asterisk. The ".b", ".w", and ".l"
extensions of mnemonics specify byte, word and long word data sizes. The
pound sign (#) marks an immediate value.


The Code 


The code as presented is far from a complete system. It is a workable 
starting point, demonstrating the features of a multitasking kernel. First I 
will explain what it does as written, then I'll go into various ways it might 
be modified for particular uses. 

The first set of equates defines the memory/port map. It should be noted 
that the 68000 uses memory mapped I/O, thus allowing any instruction that 
manipulates memory to also access hardware. The next section, commented 
out, describes the exception (hardware and software interrupt) vector table used 
by the 68000. When implementing this kernel you would set whichever of 
these vectors the application required. 

Beyond this is the beginning of the kernel data area (7000 hex in the
listing). A maximum of 8 concurrent processes are allowed, with 3 being set
up at the start. Following this is a process descriptor definition. It
contains a pointer to the next process descriptor, a priority flag and space
to store the entire register set of the CPU. The final lines set aside
additional descriptor space for each of the possible tasks.
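
The descriptor layout just described can be checked with a few lines of code. The following is only a sketch in Python, not part of the kernel: the field names are labels chosen for the illustration, and the field order (link, priority, status register, program counter, then d0-d7 and a0-a7) is the one the descriptor definition in the listing appears to use.

```python
# Field layout of one process descriptor, as described in the text:
# a link to the next descriptor, a priority word, then the saved CPU state.
# The field names are labels chosen for this sketch.
FIELDS = (
    [("next_process", 4), ("priority", 2), ("sr", 2), ("pc", 4)]
    + [(f"d{n}", 4) for n in range(8)]   # data registers d0-d7
    + [(f"a{n}", 4) for n in range(8)]   # address registers a0-a7
)


def layout(fields):
    """Return ({name: byte offset}, total size) for consecutive fields."""
    offsets, position = {}, 0
    for name, size in fields:
        offsets[name] = position
        position += size
    return offsets, position


OFFSETS, DESCRIPTOR_SIZE = layout(FIELDS)
MAX_PROCESSES = 8   # "A maximum of 8 concurrent processes are allowed"

if __name__ == "__main__":
    for name, offset in OFFSETS.items():
        print(f"{name:13s} +{offset}")
    print("descriptor size:", DESCRIPTOR_SIZE)
    print("table size     :", MAX_PROCESSES * DESCRIPTOR_SIZE)
```

Under these assumptions a descriptor occupies 76 bytes (4c hex) and the table of eight occupies 608 bytes (260 hex), which agrees with the process size and process space values printed in the symbol table at the end of the listing.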

A series of pointers is then allocated. Running is a pointer to the process
descriptor of the currently running task. Ready is a pointer to the head of the
queue of tasks waiting for CPU time. Dead is a pointer to a queue of unused
process descriptors. Conditions is an array of pointers, one for each process,
where the process descriptor's address is stored when it is "sleeping," i.e.,
when it is waiting to be restarted by another task. Finally, device_1 and
device_2 point to queues of tasks waiting on those devices.

The first part of the kernel is its initialization entry point. It sets up its
stack, initializes the needed exception vectors and queues the processes into
the running and ready slots. It then goes on to initialize three very simple
tasks for demonstration purposes. This includes setting a program counter,
stack pointer, and status register value for each. Remaining unused descriptors
are put into the dead queue for future use. Finally, the first task's stack
pointer is loaded into the user stack pointer, and its program counter and
status register values are pushed onto the stack. The task is then started
with an rte instruction, the standard method of causing the 68000 to return to
a task that generated an exception. The task will then run until an exception
(interrupt or trap) occurs.

Notice the conditional line of code for the 68010. This line was assembled
because mc68010 was set to 1 by a previous equ directive. It is necessary
when using the 68010 because this version of the CPU places at least one
more word of information on the stack during an exception cycle than the
68000 does. This fourth word contains a 4-bit field, called the format
field, that indicates whether additional words were stored on the stack during
the beginning of the exception cycle. In this case the 0 value placed on the
stack tells the 68010 that no more words relevant to this exception are on the
stack. When using a 68000, be sure that you change the equ directive to 0 to
prevent assembly of this line.
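
The difference between the two stack frames can be pictured with a short sketch. It is written in Python purely as an illustration and is not part of the kernel; the function name and its parameters are invented for the example. It builds the bytes that must be on the supervisor stack when rte executes, lowest address first.

```python
import struct


def initial_frame(pc, sr=0x0000, is_68010=True, vector_offset=0):
    """Bytes the kernel must leave on the supervisor stack before rte.

    Lowest address first: status register (word), program counter (long)
    and, on the 68010 only, a format/vector-offset word whose top four bits
    are the format field (0 = short frame, no further words follow).
    """
    frame = struct.pack(">HI", sr, pc)          # big-endian, as on the 68000
    if is_68010:
        format_field = 0
        frame += struct.pack(">H", (format_field << 12) | vector_offset)
    return frame


if __name__ == "__main__":
    # 0xA00 is used here only as a sample task entry address.
    print(initial_frame(0xA00, is_68010=False).hex())   # three words
    print(initial_frame(0xA00, is_68010=True).hex())    # four words
```

Because the stack grows downward, the kernel pushes these items in the reverse order: the format word first (68010 only), then the program counter, then the status register.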


The Kernel Body 


The body of the kernel has three entry points. The major entry point is
slice. It is accessed via the exception vector used by hardware clock interrupts,
autovector #1. The wait entry point is called (with a "trap #10") by a task
wishing to sleep while awaiting some occurrence within the system. The
condition (i.e., the condition slot) it wishes to wait on is passed in data
register 7. The corresponding signal entry point ("trap #11") is used by
another task to cause awakening of a task. The particular condition it wishes
to signal is passed in data register 7. The slice_trap entry ("trap #12") is used
by a process to give up the CPU to the next waiting process.

Other routines should only be accessed from within the kernel.
Store_runner and restore_runner save and restore the machine state when
switching tasks. Switch is used by the kernel to place the running task into
the ready queue and retrieve a waiting task.
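
The queue handling behind wait, signal and slice can be modelled in a few lines. The sketch below is in Python and only models the bookkeeping: it saves no registers, and the class and method names are invented for the illustration. It also assumes that a smaller priority number means a more urgent process and that a low-urgency idle process always sits at the tail of the ready queue, which is how the listing appears to behave.

```python
class Process:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority
        self.next = None                  # link to next descriptor in a queue


class Kernel:
    def __init__(self, processes, slots=8):
        self.running = processes[0]
        self.ready = None                 # head of the ready queue
        self.conditions = [None] * slots  # at most one sleeper per condition
        for proc in processes[1:]:
            self.link_ready(proc)

    def link_ready(self, proc):
        """Insert behind every queued process of equal or better priority."""
        prev, node = None, self.ready
        while node is not None and node.priority <= proc.priority:
            prev, node = node, node.next
        proc.next = node
        if prev is None:
            self.ready = proc
        else:
            prev.next = proc

    def run_next(self):
        """Make the head of the ready queue the running process."""
        self.running = self.ready
        self.ready = self.running.next
        self.running.next = None

    def wait(self, condition):
        """The running process goes to sleep in a condition slot."""
        self.conditions[condition] = self.running
        self.run_next()

    def signal(self, condition):
        """Wake the sleeper, if any; the signaller rejoins the ready queue."""
        sleeper = self.conditions[condition]
        if sleeper is None:
            return
        self.conditions[condition] = None
        self.link_ready(self.running)
        self.running = sleeper

    def slice(self):
        """Clock tick or voluntary yield: rotate the running process."""
        self.link_ready(self.running)
        self.run_next()


if __name__ == "__main__":
    k = Kernel([Process("task_1", 1), Process("task_2", 1),
                Process("task_3", 1), Process("dummy", 5)])
    k.wait(1)        # task_1 sleeps on condition 1
    k.wait(2)        # task_2 sleeps on condition 2
    k.signal(2)      # task_3 wakes task_2
    print("running:", k.running.name, " next ready:", k.ready.name)
```

Keeping an always-ready idle process at the back of the queue means run_next never has to deal with an empty queue, which is why the model does not test for one.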


The Tasks


The tasks shown in the listing, written for purposes of demonstration, do
nothing more than give a visual display of their existence. Task one prints a
series of ASCII 1s, then waits on condition 1 to become true. Task two
prints a series of 2s, then waits on condition 2. Task three prints a series of
3s, signals task two, then signals task one. This whole process continues
until a "clock" interrupt occurs. This clocking causes the current task to
suspend and the next task to execute.

The device managers for the ports described above are not coded in the 
listing, as these will differ greatly from one machine to the next. 


Modifying for the Real World 

Obviously the first changes to make would be to remove the fake tasks and 
write useful replacements. 

If the kernel were to be used for real-time control of specific hardware
devices, individual tasks would be created that manage each device. This
would include handshaking via interrupts and assignment of relative priorities
with the priority word in each descriptor.

If used as the basis of an operating system, a user interface task would be 
needed, as well as device managers for each port. Processes for a file system 
and a media storage manager would be needed. The operating system written 
in Small C [see Dr. Dobb’s Journal, March 1983] would be an interesting 
starting point. 

It would be possible to save task switching time by saving only those 
registers known to be used by user processes. 

The process descriptor allocation could be made completely dynamic if it 
were to be kept just above each task's stack space. This scheme would only 
work with simpler hardware that didn't map memory according to the current 
operating mode, since the descriptor is accessed from supervisor mode, while 
the descriptor would now reside in user memory. 

This software is free of restrictions for noncommercial use and
distribution. Commercial use without author's permission is forbidden. The
kernel was based on concepts outlined in Structured Concurrent Programming
With Operating System Applications by R. C. Holt, et al.


Listing 13.1

***********************************************************************
*
*       A Kernel for the MC68000 processor
*
*       (C) Copyright 1983 Steve Passe, all rights reserved
*
*       modified for 68010 12/23/83 smp
*
***********************************************************************
*
mc68010         equ     1               0: 68000, 1: 68010
first_page      equ     $0              exception vectors
kernel          equ     $800            init entry to kernel
kernel_data     equ     $7000           user scratch memory
user_sr_mask    equ     0               no trace/user/lowest int.
*
auto_v_int      equ     $64             address of autovector #1
trap_0          equ     $80             address of trap #0 vector
*
port            equ     $03ff01         acia #1 status
int_1_ack       equ     $03ff30         level 1 (clock) int ack
*
* map of exception vectors, not used when running under MACSbug
*
* first 8 cast in silicon
*
*       org     first_page+8
*
*       dc.l    error_vector            bus error
*       dc.l    error_vector            address error
*       dc.l    error_vector            illegal instruction
*       dc.l    error_vector            divide by zero
*       dc.l    error_vector            out of bounds
*       dc.l    error_vector            overflow
*       dc.l    error_vector            privilege violation
*       dc.l    error_vector            trace routine
*       dc.l    error_vector            1010 pseudo code
*       dc.l    error_vector            1111 pseudo code
*
* vectors 12 thru 23 reserved by Motorola, to be
* really safe I suppose you should init them...
*
*       org     first_page+$60          address of vector #24
*
*       dc.l    error_vector            24: spurious interrupt
*       dc.l    error_vector            25: #1 autovector, system clock
*       dc.l    error_vector            26: #2 autovector, winchester
*       dc.l    error_vector            27: level 3 int. autovector
*       dc.l    error_vector            28: level 4 int. autovector
*       dc.l    error_vector            29: #5 autovector, serial ports
*       dc.l    error_vector            30: #6 autovector, parallel port
*       dc.l    error_vector            31: #7 autovector, abort switch
*       dc.l    error_vector            32: trap #0 vector
*       dc.l    error_vector            33: trap #1 vector
*       dc.l    error_vector            34: trap #2 vector
*       dc.l    error_vector            35: trap #3 vector
*       dc.l    error_vector            36: trap #4 vector
*       dc.l    error_vector            37: trap #5 vector
*       dc.l    error_vector            38: trap #6 vector
*       dc.l    error_vector            39: trap #7 vector
*       dc.l    error_vector            40: trap #8 vector
*       dc.l    error_vector            41: trap #9 vector
*       dc.l    wait                    42: #10, kernel wait entry
*       dc.l    signal                  43: #11, kernel signal entry
*       dc.l    slice_trap              44: #12, kernel slice entry
*       dc.l    error_vector            45: trap #13 vector
*       dc.l    error_vector            46: trap #14 vector
*
* comment out trap #15 init to retain MACSbug console i/o trap...
*       dc.l    error_vector            47: trap #15 vector
*
*
* vectors 48 thru 63 reserved by Motorola, to be
* really safe I suppose you should init them also...
*
* gonna cheat here, 64-255 are for hardware interrupts that
* generate their own vector number. If your hardware can
* generate such interrupts these must also be initialized.
*
        org     kernel_data
*
* maximum # of processes in the kernel
*
processes       equ     8               # of processes supported
runners         equ     3               # of running processes
*
* the process descriptors, need one for each possible active process
*
process_1
next_process    equ     *-process_1     offset to next process
                dc.l    process_x       points to next process in queue
priority        equ     *-process_1     offset to process priority
                dc.w    5               priority of this process
sr_slot         equ     *-process_1     offset to status reg.
                dc.w    0               value of status register
pc_slot         equ     *-process_1     offset to prog. cntr.
                dc.l    dummy           value of program counter
d0_slot         equ     *-process_1     offset to d0 copy
                dc.l    0               value of d0 for this process
                dc.l    0               value of d1
                dc.l    0               value of d2
                dc.l    0               value of d3
                dc.l    0               value of d4
                dc.l    0               value of d5
                dc.l    0               value of d6
                dc.l    0               value of d7
                dc.l    0               value of a0 for this process
                dc.l    0               value of a1
                dc.l    0               value of a2
                dc.l    0               value of a3
                dc.l    0               value of a4
                dc.l    0               value of a5
a6_slot         equ     *-process_1     offset to a6 copy
                dc.l    0               value of a6
a7_slot         equ     *-process_1     offset to a7 copy
                dc.l    dummy_sp        value of a7
                ds      0               insure alignment
process_size    equ     *-process_1     size of a process descriptor
process_2       ds.b    (processes-2)*process_size      less #1 & dummy
process_x       ds.b    process_size    dummy process
process_space   equ     *-process_1


*
* the queue pointers...
*
running         dc.l    0               pointer to running process
ready           dc.l    0               pointer to head of ready queue
dead            dc.l    0               pointer to head of dead queue
*
* conditions (args to wait and signal), only
* one process can wait for any one condition
* in this version...
*
conditions      ds.l    processes       condition slots, one for
*                                       each possible process
*
* device 1 is acia #1 for this machine
*
device_1        ds.l    1               ptr to head of device_1 queue
*
* device 2 is acia #2
*
device_2        ds.l    1               ptr to head of device_2 queue


***********************************************************************
*
*       The initial entry point to the kernel....
*
***********************************************************************
*
        org     kernel
*
kernel_init
*
* setup the machine
*
        move.w  #$2700,sr                       no int, su
        move.l  #$7c00,sp                       kernel stack
*
* set the needed vectors
*
        move.l  #slice,auto_v_int               hardware clock, v1
        move.l  #disk_port,auto_v_int+(1*4)     disk interrupts, v2
        move.l  #serial_port,auto_v_int+(4*4)   acia interrupts, v5
        move.l  #parallel_port,auto_v_int+(5*4) parallel port int, v6
        move.l  #abort_button,auto_v_int+(6*4)  abort switch, v7
*
        move.l  #wait,trap_0+(10*4)             kernel wait entry, trap #10
        move.l  #signal,trap_0+(11*4)           kernel signal entry, #11
        move.l  #slice_trap,trap_0+(12*4)       (software) process slice, #12
*
* initialize and queue all the processes...
*
        lea     process_1,a6                    a6 holds ptr to running
        move.l  a6,running                      running points to p_1
        move.l  a6,a5                           a5 points @ process_1
        add.l   #(runners*process_size),a5      plus size x runners
        move.l  a5,dead                         dead points @ process_4
*
        move.l  #process_2,ready                ready points @ p_2
*
* init the dead for future
*
        move.w  #((processes-1)*process_size)-1,d7      bytes, less dbf
        move.l  a6,a4                           a4 points at base
        lea     process_2,a5                    a5 points @ process_2
next_byte
        move.b  (a4)+,(a5)+                     stuff a byte
        dbf     d7,next_byte                    another byte
*
* init first and last process reg values
*
        move.l  #task_1,pc_slot(a6)             setup program counter
        move.w  #1,priority(a6)                 setup priority
        move.w  #user_sr_mask,sr_slot(a6)       setup status register
        move.l  #task_1_sp,a7_slot(a6)          setup stack
*
        lea     process_2,a5                    a5 holds descr. base
        move.l  #process_2+process_size,(a5)    point to next process
        move.l  #task_2,pc_slot(a5)             setup program counter
        move.w  #1,priority(a5)                 setup priority
        move.w  #user_sr_mask,sr_slot(a5)       setup status register
        move.l  #task_2_sp,a7_slot(a5)          setup stack
*
        add.l   #process_size,a5                a5 holds descr. base
        move.l  #task_3,pc_slot(a5)             setup program counter
        move.w  #1,priority(a5)                 setup priority
        move.w  #user_sr_mask,sr_slot(a5)       setup status register
        move.l  #task_3_sp,a7_slot(a5)          setup stack
*
        lea     process_x,a5                    a5 holds dummy process
        move.l  #dummy,pc_slot(a5)              setup program counter
        move.w  #user_sr_mask,sr_slot(a5)       setup status register
        move.l  #dummy_sp,a7_slot(a5)           setup stack
*
* link rest of processes into dead
*
        move.l  dead,a5                         dead pointer
        moveq   #processes-(runners+1+1),d7     minus runners/dummy/dbf
next_dead
        add.l   #process_size,a5                point to next process
        move.l  a5,-process_size(a5)            into last prces.next
        dbf     d7,next_dead                    continue
*
* go for it!
*
        if      mc68010                         for the 68010 exception stack frame
        move.w  #0,-(sp)                        formatcode/exception #
        endif
        move.l  #task_1_sp,a0                   task 1 sp into a0...
        move.l  a0,usp                          ...then into user sp
        move.l  #task_1,-(sp)                   setup rte, task 1 addr.
        move.w  #0,-(sp)                        and status reg. value
        rte

***********************************************************************
*
*       the body of the kernel code...
*
*       reg d7 is used to pass condition #
*       reg a6 is reserved by kernel to keep pointer to running
*
*       entry points:
*               1: wait,        trap #10
*               2: signal,      trap #11
*               3: slice_trap,  trap #12
*
***********************************************************************
*
kernel_body
*
store_runner
*
* save the running process in its descriptor
*
        move.l  a6,-(sp)                push a6 onto stack
        move.l  running,a6              current runner into a6
        movem.l d0-d7/a0-a5,d0_slot(a6) store the data & addr. regs.
        move.l  (sp)+,a6_slot(a6)       pop saved a6
        move.w  4(sp),sr_slot(a6)       store status register
        move.l  6(sp),pc_slot(a6)       store the program counter
        move.l  (sp),6(sp)              put return over sr/pc
        addq.l  #6,sp                   setup return
        btst    #5,sr_slot(a6)          were we supervisor before?
        bne.s   is_super                yes, don't need usp...
        move.l  usp,a4                  no, get users stack pointer
        move.l  a4,a7_slot(a6)          and place in descriptor
        rts
is_super
        move.l  sp,a7_slot(a6)          present ssp value
        addq.l  #4,a7_slot(a6)          value before this subroutine
        rts
*
wait
        or      #$0700,sr               block interrupts
        bsr.s   store_runner            save this process
        mulu    #4,d7                   # by pointer size
        lea     conditions,a5           base of conditions
        move.l  a6,0(a5,d7)             store process in condition x
run
        move.l  ready,a6                get ready from queue
        move.l  (a6),ready              new head of ready queue
        bsr.s   restore_runner          setup next process
        rte
*
signal
        or      #$0700,sr               block interrupts
        mulu    #4,d7                   make offset to condition
        add.l   #conditions,d7          add base of conditions
        exg     d7,a6                   setup indirection on a6
        tst.l   (a6)                    test for null
        exg     d7,a6                   restore prior order
        bne.s   switch                  if waiting go for it
        rte
*
switch
        bsr.s   store_runner            save runner
        move.l  d7,-(sp)
        lea     ready,a4                start at head of ready queue
        bsr.s   link_queue              put into ready queue
        move.l  (sp)+,a5                setup indirection again
        move.l  (a5),a6                 get the waiter
        clr.l   (a5)                    not waiting anymore
        bsr.s   restore_runner          restore new runner
        rte
*
slice
        or      #$0700,sr               block interrupts
        tst.b   int_1_ack               clear interrupt
soft_slice
        bsr     store_runner            save runners registers
        lea     ready,a4                start at head of ready queue
        bsr.s   link_queue              link runner into ready queue
        bra.s   run                     go for ready
*
* entry to slice from software
*
slice_trap
        or      #$0700,sr               block interrupts
        bra.s   soft_slice
*
restore_runner
*
* restore runner from descriptor (in a6 when called)
*
        move.l  a6,running              record new runner
        move.l  (sp)+,a3                save return, clear stack
        move.l  a7_slot(a6),a4          stack to restore
        btst    #5,sr_slot(a6)          were we supervisor before?
        bne.s   was_super               yes, don't need usp
        move.l  a4,usp                  ...then into usp
        bra.s   sp_set                  we hope!
was_super
        move.l  a4,sp                   ssp = old ssp
sp_set
        move.l  pc_slot(a6),-(sp)       restore the program counter
        move.w  sr_slot(a6),-(sp)       restore status register
        move.l  a3,-(sp)                restore return
        movem.l d0_slot(a6),d0-d7/a0-a6 restore data & addr. regs.
        rts
*
link_queue
*
* link a process into the queue pointed to by a4
*
        move.l  a4,a3                   remember last
        move.l  (a4),a4                 next link into head
        move.w  priority(a4),d7         get priority into dat reg.
        cmp.w   priority(a6),d7         compare running and head
        bls.s   link_queue              keep trying
        move.l  a6,(a3)                 last.next == running
        move.l  a4,(a6)                 (former) running.next == link
        rts
*
* the first process
*
task_1
        move.l  #80,d1
t_1     move.l  #port,a0                acia #1
        move.b  #'1',d0                 why?
        jsr     put_char                send it
        moveq   #1,d0                   one tenth sec
        jsr     sleep_tenth             kill time
        dbf     d1,t_1
        moveq   #1,d7                   wait on condition one
        trap    #10
        bra.s   task_1                  forever
        ds      128                     enough?
        ds      0                       insure alignment
task_1_sp
*
* the second process
*
task_2
        move.l  #80,d1
t_2     move.l  #port,a0                acia #1
        move.b  #'2',d0                 why not!
        jsr     put_char                send it
        moveq   #1,d0
        jsr     sleep_tenth             kill time
        dbf     d1,t_2
        moveq   #2,d7                   wait on condition two
        trap    #10
        bra.s   task_2                  forever
        ds      128                     i hope so
        ds      0                       insure alignment
task_2_sp
*
* the third process
*
task_3
        move.l  #80,d1
t_3     move.l  #port,a0                acia #1
        move.b  #'3',d0
        jsr     put_char                send it
        moveq   #1,d0
        jsr     sleep_tenth             kill time
        dbf     d1,t_3
        moveq   #2,d7                   signal two
        trap    #11
        moveq   #1,d7                   signal one
        trap    #11
        bra.s   task_3                  forever
        ds      128
        ds      0                       insure alignment
task_3_sp
*
* the dummy process
*
dummy
        move.b  #'!',d0                 why for?
        move.l  #port,a0                acia #1
        jsr     put_char                send it
        bra.s   dummy                   forever
        ds      128
        ds      0                       insure alignment
dummy_sp
*
* sleep one tenth second
*
sleep_tenth
tenth_constant  equ     50000
        mulu    #tenth_constant,d0      multiply by constant
s_t     dbf     d0,s_t
        rts
serial_port
*
* process serial port interrupts
*
        rte
parallel_port
*
* process parallel port interrupts
*
        rte
abort_button
*
* process the abort button
*
        rte
error_vector
*
* process misc. vectors called incorrectly...
*
        rte
disk_port
*
* process disk interrupts
*
        rte
*
* routine to output a char to port addressed by register a0
*
*       a0:     address of port
*       d0:     byte to output
*
put_char
        move.l  d1,-(sp)                save d1
p_c
        move.b  (a0),d1                 load status word
        and.b   #2,d1                   check tbuffer empty
        beq.s   p_c                     not empty yet...
        move.b  d0,2(a0)                ready, ship it out
        move.l  (sp)+,d1                restore d1
        rts
*
        end     kernel

symbol          hex value       decimal         atrb.

a6_slot         44              68              0x01
a7_slot         48              72              0x01
abort_button    e8e             3726            0x01
auto_v_int      64              100             0x01
conditions      726c            29292           0x01
d0_slot         c               12              0x01
dead            7268            29288           0x01
device_1        728c            29324           0x01
device_2        7290            29328           0x01
disk_port       e92             3730            0x01
dummy           d70             3440            0x01
dummy_sp        e80             3712            0x01
error_vector    e90             3728            0x01
first_page      0               0               0x01
int_1_ack       3ff30           261936          0x01
is_super        952             2386            0x01
kernel          800             2048            0x01
kernel_body     920             2336            0x01
kernel_data     7000            28672           0x01
kernel_init     800             2048            0x01
link_queue      9ec             2540            0x01
next_byte       870             2160            0x01
next_dead       8fa             2298            0x01
next_process    0               0               0x01
p_c             e96             3734            0x01
parallel_port   e8c             3724            0x01
pc_slot         8               8               0x01
port            3ff01           261889          0x01
priority        4               4               0x01
process_1       7000            28672           0x01
process_2       704c            28748           0x01
process_size    4c              76              0x01
process_space   260             608             0x01
process_x       7214            29204           0x01
processes       8               8               0x01
put_char        e94             3732            0x01
ready           7264            29284           0x01
restore_runner  9c2             2498            0x01
run             96e             2414            0x01
runners         3               3               0x01
running         7260            29280           0x01
s_t             e84             3716            0x01
serial_port     e8a             3722            0x01
signal          97a             2426            0x01
sleep_tenth     e80             3712            0x01
slice           9a6             2470            0x01
slice_trap      9bc             2492            0x01
soft_slice      9b0             2480            0x01
sp_set          9da             2522            0x01
sr_slot         6               6               0x01
store_runner    920             2336            0x01
switch          992             2450            0x01
t_1             a06             2566            0x01
t_2             b2a             2858            0x01
t_3             c4e             3150            0x01
task_1          a00             2560            0x01
task_1_sp       b24             2852            0x01
task_2          b24             2852            0x01
task_2_sp       c48             3144            0x01
task_3          c48             3144            0x01
task_3_sp       d70             3440            0x01
tenth_constant  c350            50000           0x01
trap_0          80              128             0x01
user_sr_mask    0               0               0x01
wait            95c             2396            0x01
was_super       9d8             2520            0x01


A Pseudo 
Random-Number 
Generator 


Michael P. McLaughlin 


The pseudo random-number generator routine presented
below uses an algorithm commonly known as "GGUBS."
It has good statistical properties and a very long period: it runs
through every integer from 1 to 2**31-2 before repeating.
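
Before reading the 68000 listing, it may help to see the arithmetic in a high-level form. The following Python sketch is an illustration only (the function names are invented for it). It follows the synthetic-division recipe given in the listing's header comments, which keeps every intermediate result within 32 bits, and compares it against a direct computation that relies on Python's unlimited-precision integers.

```python
M = 2**31 - 1        # modulus, 2147483647
A = 16807            # multiplier, 7**5
Q = M // A           # 127773, the "synthetic modulus" in the listing
R = M % A            # 2836


def ggubs_next(x):
    """One step of x -> (A * x) mod M using only 32-bit-sized quantities."""
    k1 = x // Q
    x = A * (x - k1 * Q) - k1 * R
    if x < 0:
        x += M
    return x


def ggubs_direct(x):
    """Reference step using Python's arbitrary-precision integers."""
    return (A * x) % M


if __name__ == "__main__":
    x = 1
    for _ in range(1000):
        assert ggubs_next(x) == ggubs_direct(x)
        x = ggubs_next(x)
    # The listing's header states the value expected after 1000 steps.
    print("RAND(1000) =", x)
```

The trick works because M = A*Q + R with R smaller than Q, so A*(x mod Q) and (x div Q)*R each stay below M and their difference can be corrected with a single addition.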


Listing 14.1

*PSEUDO-RANDOM NUMBER GENERATOR -- (USES D2-D7)
*
*GIVEN ANY SEED (1 TO 2**31-2) IN D7 (LONGWORD), THIS GENERATOR YIELDS A
*NON-REPEATING SEQUENCE (RAND(I)) USING ALL INTEGERS IN THE RANGE 1 TO
*2**31-2. THE AVERAGE EXECUTION TIME IS 240 MICROSECONDS (AT 8 MHz). THIS
*GENERATOR, REFERRED TO IN THE LITERATURE AS "GGUBS", IS KNOWN TO POSSESS
*GOOD STATISTICS. THE ALGORITHM IS:
*
;       RAND(I+1) = (16807*RAND(I)) MOD (2**31-1)
*
*WHEN PROPERLY CODED, THIS ALGORITHM WILL TRANSFORM RAND(0) = 1 INTO
*RAND(1000) = 522329230. THE FOLLOWING IMPLEMENTATION USES SYNTHETIC
*DIVISION, VIZ.,
*
;       K1 = RAND(I) DIV 127773
;       RAND(I+1) = 16807*(RAND(I)-K1*127773)-K1*2836
;       IF RAND(I+1)<0 THEN RAND(I+1) = RAND(I+1)+2147483647
*
; REFERENCE:
;       BRATLEY, P., FOX, B.L. and L.E. SCHRAGE, 1983.
;       A GUIDE TO SIMULATION. SPRINGER-VERLAG.
*
RANDOM  MOVE.L  D7,D6           ;copy RAND(I)
        MOVE.L  #127773,D2      ;synthetic modulus
        BSR.S   DIV             ;divide D6 by 127773
        MOVE.L  D4,D5           ;copy K1
        MULS    #-2836,D5       ;D5 = -2836*K1
        MULU    #42591,D4       ;multiply D4 by 127773
        MOVE.L  D4,D6
        ADD.L   D4,D4
        ADD.L   D6,D4
        SUB.L   D4,D7           ;D7 = RAND(I)-K1*127773
        MOVEQ   #4,D4           ;counter
RAN1    MOVE.L  D7,D6           ;multiply D7 by 16807
        LSL.L   #3,D7
        SUB.L   D6,D7
        DBRA    D4,RAN1
        ADD.L   D5,D7           ;D7 = RAND(I+1)
        BPL.S   EXIT
        ADD.L   #2147483647,D7  ;normalize negative result
EXIT    RTS                     ;D7 = RAND(I+1)
*
; RAND(I) (31 BITS) DIV 127773 (17 BITS)
*
DIV     ADD.L   D6,D6           ;shift out unused bit
        CLR.L   D4              ;quotient
        MOVEQ   #14,D3          ;counter
        MOVE    D6,D5           ;save low word of RAND(I)
        SWAP    D6
        AND.L   #0FFFFH,D6      ;D6 = RAND(I) DIV 2**15
DIV1    ADD     D4,D4           ;line up quotient
        ADD     D5,D5           ;and dividend
        ADDX.L  D6,D6           ;shift in bit of low word
        CMP.L   D2,D6           ;trial subtraction
        BMI.S   DIV2
        SUB.L   D2,D6           ;real subtraction
        ADDQ    #1,D4           ;put 1 in quotient
DIV2    DBRA    D3,DIV1         ;decrement counter and loop
        RTS
        END


Generating
Nonuniform
Distributions
of Random Numbers


Chris Crawford 


Sometimes it's important to generate random numbers 
whose distribution is nonuniform. This chapter shows how 
to approximate the well-known (but hard to simulate) 
Gaussian distribution of numbers (the "bell curve"). 


Games, simulators and surveys: these are examples of programs that
depend on random-number generators (RNGs). Most RNGs yield uniform
distributions, which means that the probability of obtaining a particular
number is the same for all numbers within the possible range. For example,
when a uniform RNG that returns a value between 1 and 10 inclusive is used,
the odds of returning a 1 are the same as the odds of returning a 10. A graph
of the probability of returning a number as a function of the number itself is
shown in Figure 15.1.

There are many algorithms for generating uniform distributions of random 
numbers; the most common is the polynomial counter algorithm, which uses 
a complex formula to generate numbers with no apparent sequence. This 
algorithm is so useful that it was implemented directly in the custom chips of 
all Atari 8-bit computers. 


Favoring the Middle Values 

Although uniform distributions are perfectly adequate for many 
applications, frequently programmers need nonuniform distributions of 
random numbers. A programmer might need to choose randomly from a set of 
values, but require a very low probability of obtaining the extreme values in 
the range. For example, suppose you are writing a program that must 
randomly assign prices to software packages, based on actual market data. 


FIGURE 15.1 Uniform distribution of random numbers. (Graph of
probability versus number.)


Although most prices fall in the $50 to $500 range, there are a few packages 
that cost $1,000 or more. You can't simply reject those packages and set an 
arbitrary cutoff point of, say, $600. A uniformly distributed random-number 
generator, however, will choose the very high prices as often as it will choose 
the more typical prices. Using a nonuniform distribution is one way to solve 
the problem. 


Quick and Dirty, but Normal 

The "normal" distribution, a particular kind of nonuniform distribution,
generates random numbers that fall into the common bell curve pattern. This
pattern, also called the Gaussian distribution, occurs in nature whenever a
quantity is the sum of many small, independent variations. For example, the
heights of humans, if shown as a bar chart, would form a bell curve.

There's a way to simulate a bell curve with a uniform RNG. The
algorithm I suggest can be presented in the form of the following 68000 code:


        MOVEQ   #0,d2           initialize sum
        MOVE.W  CURVEPARM,d1    initialize loop counter
LOOP    JSR     RANDOM          get random # into d0
        ADD.W   d0,d2           add random # into sum
        SUBQ.W  #1,d1           decrement loop counter
        BNE     LOOP            loop logic
        DIVS    CURVEPARM,d2    normalize sum (quotient in low word of d2)


This code assumes that the subroutine RANDOM preserves registers d1
and d2, and returns a uniformly distributed random number in register d0.
It also assumes that the sum of CURVEPARM random numbers fits in an unsigned
16-bit word. The critical parameter for this code is CURVEPARM, a variable
that determines the shape of the distribution. The effect of CURVEPARM can
be seen in Figure 15.2, which displays graphs of distributions made with
various values of CURVEPARM.

As CURVEPARM increases, the distribution becomes smoother, peaks 
higher, and begins to approach the shape of a classic bell curve. Note that 
higher values of CURVEPARM require longer execution times. This 
algorithm doesn't actually generate data with a true Gaussian distribution, but 
for many applications, it's possible to get sufficiently close to it without 
using up a lot of CPU time. 
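
The same averaging idea can be sketched in C. This is a minimal model, not a port of the routine above: uniform_random is a hypothetical stand-in for the RANDOM subroutine (a small linear congruential generator returning values from 0 to 32767), and bell_random plays the role of the summing loop and the final divide.

```c
#include <stdint.h>

/* Hypothetical stand-in for the RANDOM subroutine: a small linear
 * congruential generator returning uniform values in 0..32767. */
static uint32_t rng_state = 1u;

uint16_t uniform_random(void)
{
    rng_state = rng_state * 1103515245u + 12345u;
    return (uint16_t)((rng_state >> 16) & 0x7FFFu);
}

/* Average curveparm uniform samples, as the 68000 loop does with DIVS.
 * curveparm must be at least 1. */
uint16_t bell_random(unsigned curveparm)
{
    uint32_t sum = 0;   /* 32 bits: ample room for small curveparm values */
    for (unsigned i = 0; i < curveparm; i++)
        sum += uniform_random();
    return (uint16_t)(sum / curveparm);
}
```

With a curveparm of 1 the output is simply the uniform generator. With a curveparm of 4, roughly nine results in ten should land in the middle half of the range, which is the "favoring the middle values" effect the text describes.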


FIGURE 15.2 Graphs with normal distributions (probability plotted against number) for CURVEPARM=2, CURVEPARM=3 and CURVEPARM=4.


The Worm Memory Test


Jan Steinman 


The Worm is a memory test that has the unusual
characteristic of being able to overlay itself
while it’s running. It makes special use of
the 68000’s instruction prefetch register.


No, the Worm Memory Test is not a method for quantifying the mental
retentive powers of long, cylindrical invertebrates. It is a test that could
help diagnose certain types of computer memory errors. Worm (see listing)
uses a dynamically executing program as the actual test data. Unlike previous 
memory test programs of this type, this one has a special twist: It can 
overlay itself while it is executing, thanks to the MC68000's prefetch 
register. 


Some Fetching Facts 

Never heard of the prefetch register? To understand how the memory test 
works, it might help to review the way the MC68000 fetches and executes 
instructions. The MC68000 uses instruction pipelining in order to speed 
execution. There is, in effect, a 16-bit register between the data bus and the 
instruction decoding logic. (The MC68010 has 32 bits of prefetch and the 
68020 has a 64-entry instruction cache, but the results should be similar.) 
When an instruction is executed, the opcode for that instruction is first loaded 
into the prefetch register (often while the previously fetched instruction is 
being executed), then the instruction is moved into the instruction decoding 
register, where it is executed. The net effect is that the processor usually has a 
handle on the next thing it is supposed to do. 

Prefetch works fine most of the time, but it does slow things down during 
certain operations. If the instruction being executed causes a nonsequential 
instruction to be executed, execution may be either faster or slower. In the 
case of a conditional branch instruction, a branch taken is quite fast because 
the prefetch register already holds the displacement that must be added to the 
program counter in order to fetch the next nonsequential instruction. A branch 
not taken, however, will be a little faster if it is a short branch, because the
next instruction is already in the prefetch register and the two clocks needed to 
add a displacement to the program counter can be saved. The worst case 
happens when a branch is not taken and the branch displacement is 16 bits. In 
that case, the processor has useless information in the prefetch register and 
must flush that information before it can fetch the next instruction. 

Other nonsequential instructions cause an immediate flush of the prefetch 
register and use an extra four clocks simply to restart the pipeline. One 
exception is the decrement-and-branch instruction, which, like the taken 
branches, benefits from having the branch displacement handy. (The 
MC68010, with its 32-bit prefetch register, actually executes many 16-bit 
instructions out of the prefetch register if they precede a decrement-and-branch 
instruction.) 


How the Worm Crawls 

Worm depends on these characteristics of pipelining in order to overlay 
itself while it is running, but it needs some management and control in order 
to be useful—a Worm on the loose would quickly destroy all memory! 
Besides Worm, a complete memory test requires two additional parts: an 
initialization sequence and a routine for controlling Worm and reporting its 
findings. 

The initialization routine, Init, has some special characteristics and
includes most of the system dependencies. It is executed only once, at the
beginning, and is therefore throwaway code. That is why it is placed last;
Worm actually crawls right over its initialization code in this
implementation. The registers are set up to the specifications of Worm and
several important system functions are performed. In particular, it is
important that page faulting does not occur in systems that support virtual
memory; if special hocus-pocus is needed to turn off interrupts, it should be
done here.

Manager exercises control over Worm and is responsible for com- 
municating errors it discovers and for displaying progress messages if desired. 
When Manager is entered upon completion of a Worm pass, it must decide if 
it has been entered because of an error or simply as a point of control. If there 
has been an error, Worm is no longer runnable, so Manager will have to 
report the error and terminate. If no error is detected, Manager must check the 
progress of Worm to keep it from consuming all memory. At this point, 
Manager can decide that enough memory has been checked to warrant a 
progress report of some kind. 
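
Manager's decision can be modeled as a small C function. This is only a sketch of the control flow: the names manager_decide and worm_action are invented for the illustration, and the parameters stand in for the registers the listing uses (d0 for the loop counter, a5 for the Worm's current address, d6 for the base of the test area, d5 for the progress mask).

```c
#include <stdint.h>

typedef enum { WORM_ERROR, WORM_DONE, WORM_REPORT, WORM_CONTINUE } worm_action;

/* count       : loop counter left in d0.w by Worm (negative = ran to completion)
 * worm_addr   : current address of the crawling Worm (a5)
 * base_addr   : lowest address to be tested (d6)
 * report_mask : address bits that are all zero at a progress report (d5) */
worm_action manager_decide(int16_t count, uint32_t worm_addr,
                           uint32_t base_addr, uint32_t report_mask)
{
    if (count >= 0)
        return WORM_ERROR;       /* compare failed before the count expired */
    if (worm_addr == base_addr)
        return WORM_DONE;        /* reached the bottom of the test area */
    if ((worm_addr & report_mask) == 0)
        return WORM_REPORT;      /* crossed a reporting boundary */
    return WORM_CONTINUE;        /* jump straight back into Worm */
}
```

The order of the checks matters: an error must be detected first, because after a failed compare the Worm image can no longer be trusted to run.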

The real heart of the whole thing is, after all, Worm. Worm simply
replicates itself, one longword lower in memory, while comparing the new
copy of itself against the original, which never executes. Worm may be the
heart of the memory test, but the three instructions starting at Crawl are
where the magic happens. This loop starts at the beginning of Worm, and
copies the first longword down to Worm-4. It continues with each additional
longword, until it gets to the longword at Crawl+4, which is a dbne
instruction with its 16-bit displacement. The preceding move.l and cmp.l
have already been copied down.
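
Leaving the self-executing trick aside, the data movement of one Worm pass can be modeled in C. This sketch is an assumption-laden simulation, not the test itself: memory is a small array of longwords, and bad_index with mem_write is a hypothetical way to inject a fault (bit 0 stuck at zero in one cell).

```c
#include <stdint.h>
#include <stddef.h>

#define MEM_LONGS 64

static uint32_t memory[MEM_LONGS];       /* RAM under test, in longwords */
static size_t   bad_index = MEM_LONGS;   /* hypothetical faulty cell; none by default */

/* Store a longword; the faulty cell has bit 0 stuck at zero. */
static void mem_write(size_t index, uint32_t value)
{
    memory[index] = (index == bad_index) ? (value & ~1u) : value;
}

/* One pass: copy the image at 'pos' one longword lower, checking each copied
 * longword against the pristine original. Returns the failing index, or -1. */
long worm_pass(size_t pos, const uint32_t *original, size_t longs)
{
    for (size_t i = 0; i < longs; i++) {
        mem_write(pos - 1 + i, memory[pos + i]);
        if (memory[pos - 1 + i] != original[i])
            return (long)(pos - 1 + i);
    }
    return -1;
}
```

Calling worm_pass repeatedly with a decreasing pos walks the image down through the array. Each pass leaves the image's last longword behind at its old location, which is how the Droppings pattern in the listing comes to fill the tested memory.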

At this point, it becomes a little difficult to keep track of what is data and
what is code. When the move.l is in the instruction decode register, ready to
be executed, the following cmp.l is in the prefetch register, waiting its turn to
be executed. When the move.l at Crawl executes, it moves the dbne
instruction into the location it and the following cmp.l are currently
occupying. The processor has no way of knowing it has just invalidated its
prefetch register, so it continues, moving the cmp.l instruction into the
instruction decode register and moving the following dbne into the prefetch
register. The cmp.l executes, comparing the dbne just moved with the
original while moving the branch displacement for the dbne into the prefetch
register.

Assuming the compare was successful, the dbne executes, decrementing d0
and branching backward 4 bytes to where the move.l used to be. The prefetch
register is flushed because of the branch, so the value at that location is loaded
into the prefetch register and immediately into the instruction decode register.
But what is loaded? A copy of the dbne, complete with the same negative
displacement value. The condition codes have not changed, and the count
register d0 should not be anywhere near 0, so the copy of the dbne gets
executed identically to its predecessor, which still resides in the next
longword. The dbne copy branches to the move.l copy, and the loop
continues moving the code down 4 bytes. (See table.)


            Before                  After
Crawl-4     ...                     move.l  (a0)+,(a1)
Crawl-2     ...                     cmp.l   (a1)+,(a2)+
Crawl       move.l  (a0)+,(a1)      dbne    d0,-6
Crawl+2     cmp.l   (a1)+,(a2)+
Crawl+4     dbne    d0,-6


TABLE 16.1 The test in action.
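
The branch arithmetic behind the table can be checked with a few lines of C. The helper name branch_target is an assumption made for this sketch; it relies only on the 68000 rule that a 16-bit branch displacement is sign-extended and added to the address of the displacement word, which is the opcode address plus 2.

```c
#include <stdint.h>

/* Target of a 68000 DBcc or word-sized Bcc: the 16-bit displacement is
 * sign-extended and added to the address of the displacement word
 * (opcode address + 2). */
uint32_t branch_target(uint32_t opcode_addr, uint16_t disp_word)
{
    int32_t disp = (int16_t)disp_word;   /* 0xFFFA becomes -6 */
    return opcode_addr + 2u + (uint32_t)disp;
}
```

With the original dbne at Crawl+4 and a displacement word of hex FFFA, the target is Crawl. Once the same longword has been copied down to Crawl, the identical displacement sends the copy to Crawl-4, which is where the relocated move.l now lives.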


When the count register d0 underflows, the dbne copy drops through,
interrupts are enabled, Worm’s dynamic image pointer a5 is adjusted to point
to the new Worm copy, and the Worm reports back to Manager. Note that
none of the Worm code is ever executed before it has been compared and
verified.

It is vitally important to disable interrupts when the move.l overlays itself
and the following cmp.l. An interrupt at this point causes the prefetch to be
flushed when the interrupt is serviced. Upon return from the interrupt, the
displacement part of the dbne (hex FFFA) will be fetched as an instruction.
This will cause a "line 1111 emulator exception" message unless your system
has a coprocessor with an ID code of 7, but either way Worm will be broken
and the memory test will fail. And of course, it is important that the length
of Worm remains a multiple of 4 if you decide to modify it!


But What Good Is It?

I originally developed the MC68000 Worm Memory Test for an embedded 
processor application that was having dynamic-RAM refresh problems. It was 
discovered that conventional RAM tests, which move smoothly up through 
consecutive addresses, were masking the problem by unintentionally 
providing software refresh. Worm is not long enough to cycle through all of
a dynamic RAM's row-address-strobe (RAS) lines, so it provides no such
refresh and was able to help diagnose the problem.

In the form presented, this implementation is useful primarily as an 
illustrative example of position-independent coding, modular design and, of 
course, a unique use of the prefetch register. It could be put to practical use in 
several ways. 

The best use of the memory test might be to have it running continuously
as a very low priority task. Manager would have to take some of the
responsibility of Init by allocating test memory and restarting Worm when it
finished testing a buffer. The interrupt disabling code may be simpler on
systems without virtual memory (on the Commodore Amiga, for example, it
is a simple memory store).

Virtual-memory systems would also need to add code to branch around the 
interrupt disabling code on the copy of the first longword only, which would 
allow the memory test to generate page faults whenever it first crosses a page 
boundary. To make it practical in such systems, Manager would have to 
access the memory-management hardware in order to map faulty virtual 
locations to broken chips. 

The Worm routine itself can hold much more code if desired. I originally 
had much of Manager's decision code in Worm, which did speed it up but at 
the expense of simplicity. In a message-based system, such as the Amiga, 
Manager could be totally deleted. Worm could contain all the task code, 
merrily crawling through any available RAM it could find and sending error 
reports through intertask messages—all with minimal impact on the user. 


Listing 16.1


***** The Worm Memory Test ****************************************************
* Author: Jan W. Steinman, 2002 Parkside Ct., West Linn, OR 97068.
*
* The Worm memory test has three parts. Init sets up the registers for the
* Worm. The Display Manager interacts with the Worm each pass and periodically
* Displays the Worm's progress. The Worm itself Worms itself through memory,
* from high to low, checking memory against a copy of itself. The Droppings
* form a pattern through memory when the test is complete.
*
* This version runs on the Tektronix 4404 under Uniflex. System dependent code
* is mostly segregated to the Init, Display, Disable and Enable routines. Two
* instructions in the Worm routine are system dependent, for enabling and
* disabling interrupts.
*
* Register usage:
*
*       D0      scratch register.
*       D1      scratch register.
*       D2      scratch register.
*       D3      scratch register.
*       D4
*       D5      address mask for determining if time to show progress.
*       D6      base of memory area under test.
*       D7      length of Worm in long words.
*       A0      scratch register.
*       A1      scratch register.
*       A2      scratch register.
*       A3      pointer to Display manager for position independent access.
*       A4      pointer to permanent Worm image for comparison.
*       A5      pointer to crawling Worm image.
*       A6
*       A7      stack pointer.
*
* These included files contain system definitions and interrupt (signal)
* numbers for the Uniflex operating system. Don't bother to list these.
*
        OPT     lis
        DEFINE                  (This makes all labels global for debug.)
*
* Set D_MASK with the bits that are zero at each progress report.
*
D_MASK  EQU     $00003FC        Report each boundary passed.
REL_SIZ EQU     4               Relocation is four bytes at a time.
MEM_SIZ EQU     $2000*REL_SIZ   Test a 32K chunk.
DISABLE EQU     2               Trap number for Disable routine.
ENABLE  EQU     3               Trap number for Enable routine.
CR      EQU     $0D             Carriage return.
LF      EQU     $0A             Line feed.
*
* Uniflex will not allow intersection math, so put all the code in the DATA
* section, and don't use TEXT or BSS at all!
*
        DATA                    Assemble into writable data section.
MemBeg  EQU     *
***** hexadecimalize **********************************************************
* hexadecimalize converts a long word to eight ASCII hexadecimal characters.
* This routine is machine and OS independent. It uses a simple table look-up
* to generate the hexadecimal string.
*
* Entry: d0 -- Long word to be converted to hex.
*        a0 -- Pointer to buffer where hex characters will go.
*
* Exit:  d2 -- -1. (Just in case someone cares!)
*        d0 -- unchanged.
*        -8(a0) -- points to eight ASCII characters.
*
* Uses:  d3 -- nybble mask: constant $0F.
*        d2 -- nybble counter.
*        d1 -- current nybble to convert is LSN.
*
CharTab DC.B    '0123456789ABCDEF'      Where we keep our hex characters.
hexadecimalize
        move.l  #7,d2           Bytes to make - 1.
        move.l  #$0F,d3         Nybble mask.
HexLoop rol.l   #4,d0           Shift the next nybble into the LSN,
        move.l  d0,d1           make a copy for masking,
        and.l   d3,d1           mask out all but least significant nybble,
*                               index into char table and store result.
        move.b  CharTab(pc,d1),(a0)+
        dbra    d2,HexLoop      Repeat until done, and when done,
        rts                     hit the road, Jack.
***** Manager *****************************************************************
* Manager checks the Worm's progress, and periodically reports to the Display.
* This routine is also entered if an error is encountered.
*
* Entry: d0 -- W_LONGS complement of pass count if error, else -1.
*        a1 -- test address pass/fail value.
*
* Exit:  via direct jump to Worm at (A5).
*
* Uses: a3, a2, al, dO, a7, al; ad 
*
* Stack: one level, plus needs of Display.
*
ErrMsg  DC.B    CR,'Worm reports memory error at '
ErrAddrMsg
        DC.B    '00000000 on pass '
ErrCountMsg
        DC.B    '00000000.',CR
E_SIZ   EQU     *-ErrMsg
DoneMsg DC.B    CR,'Worm tested memory from '
DoneBegAddrMsg
        DC.B    '00000000 through '
DoneEndAddrMsg
        DC.B    '00000000 successfully.',CR
D_SIZ   EQU     *-DoneMsg
ProgMsg DC.B    '00000000',CR
P_SIZ   EQU     *-ProgMsg
        EVEN                    (Stay on legal instruction boundary.)
Manager tst.w   d0              Was loop exited by error, or countdown?
        bpl.s   GetErrMsg       Error, go report it.
        cmp.l   a5,d6           Countdown, so are we done yet?
        beq.s   GetDoneMsg      Yes. Go finish up.
        move.l  a5,d0           No, put the new source where we can
        and.l   d5,d0           look at the bottom bits: on boundary?
        beq.s   Report          Yes, set up for progress report.
        jmp     (a5)            No. Keep on Crawlin'...
*                               Finish up. Get the pointer to start addr,
GetDoneMsg
        lea     DoneBegAddrMsg(pc),a0
        move.l  a1,d0           and the value to plug in,
        bsr     hexadecimalize  which gets converted, likewise, get
        lea     DoneEndAddrMsg(pc),a0
        move.l  #MEM_SIZ,d0     the end address and its value,
        bsr     hexadecimalize  also converted to hexAscii.
        lea     DoneMsg(pc),a0  Get pointer to complete done message,
        move.l  #D_SIZ,d3       length of the done message,
        pea     Exit(pc)        push a return pointer,
        bra.s   Display         and go display the message.
*                               Make an error report. Get message ptr,
GetErrMsg
        lea     ErrCountMsg(pc),a0
        sub.b   #W_LONGS-1,d0   convert worm count to a pass count,
        bsr     hexadecimalize  make it hex for Display.
*                               Get addr of ASCII error addr,
        lea     ErrAddrMsg(pc),a0
        move.l  #-4,d0          get bad long addr to display,
        add.l   a1,d0           less four to account for postincrement,
        bsr     hexadecimalize  make it hex for Display.
        lea     ErrMsg(pc),a0   Get pointer to whole err msg,
        move.l  #E_SIZ,d3       the size for the write,
        pea     Exit(pc)        push a return pointer,
        bra.s   Display         and Display the message.
*                               Progress report. Get message ptr,
Report  lea     ProgMsg(pc),a0
        move.l  a5,d0           load the checked address,
        bsr     hexadecimalize  make it hex for Display.
        sub.l   #8,a0           Regain pointer to the message,
        move.l  #P_SIZ,d3       get the size for the write,
        pea     (a5)            push a return ptr to the new Worm,
*                               and drop through into Display.
***** Display *****************************************************************
* Display is an implementation-dependent scheme for reporting the Worm's
* progress. Upon entry, A0 contains a pointer to a string to Display, and D3
* contains the length of the string to Display.
*
* Entry: d3 -- number of bytes to display.
*        a0 -- address of a string to display.
*
* Uses:  d0 -- file descriptor of stdout.
*        a1 -- scratch register for pointing to SysCall param block.
*
* Stack: as needed by system call.
*
******** BEGIN SYSTEM-DEPENDENT CODE ********
Display move.l  d3,-(a7)        Load the byte count,
        move.l  a0,-(a7)        the actual string pointer,
        move.w  #write,-(a7)    and the system call index,
        move.l  a7,a0           point to the syscall parameter block,
        move.l  #1,d0           load file descriptor for stdout,
        SYS     indx            and write the message.
        add.l   #10,a7          Remove the params from the stack, and
        rts                     return somewhere.
*
* For lack of a better place to put it, the system-dependent exit code is here.
*
Exit    SYS     term            Terminate this program. (System dependent.)
******** END SYSTEM-DEPENDENT CODE ********


***** Disable, Enable *********************************************************
* These routines provide the exclusion mechanism for the non-interruptible code
* in Worm at Crawl. These routines must execute in supervisor state, therefore
* they are executed via the TRAP exception instruction. Enable requires that
* D1 be preserved from the preceding Disable.
*
* Uses:  SR -- interrupt mask is raised and lowered.
*        d2 -- scratch register for restoring original interrupt mask.
*        d1 -- scratch register storage place for old interrupt mask.
*
******** BEGIN SYSTEM-DEPENDENT CODE ********
Disable move    sr,d1           Grab the status register,
        and.w   #$0300,d1       keep only the interrupt bits,
        and     #$0300,sr       and disable all interrupts
        SYS     cpint,SIGTRAP2,Disable
        rtr                     before entering critical code region.
Enable  move    sr,d2           Regain the status register,
        or.w    d1,d2           reset the previous interrupt level,
        move    d2,sr           and enable the proper interrupts
        SYS     cpint,SIGTRAP3,Enable
        rtr                     before entering critical code region.
******** END SYSTEM-DEPENDENT CODE ********


***** Worm ********************************************************************
* Worm is a self-modifying, self-relocating procedure which starts at some
* location in high memory and works its way down to its end address,
* periodically reporting its progress.
*
* The loop at Crawl depends strongly on the 68000 prefetch mechanism. This
* loop will not work on a 68020 machine (which has a 64 entry cache), nor on
* most simulators (which often do not bother to simulate prefetch accurately).
* This loop will also not work with the TRACE bit set, and must be protected
* from all interrupts, including page faults in virtual memory systems.
*
* When this loop moves the DBNE long word at Crawl+4, it overlays the MOVE.L
* and the CMPM.L at Crawl. The CMPM.L is in the prefetch queue, so it gets
* executed even though its memory image has just been clobbered. The DBNE is
* fetched, and its execution flushes the prefetch queue as is the case with all
* branches. Execution continues with the copy of the DBNE just moved, which
* executes again, branching to Crawl-4, the new loop location. Note that the
* loop count gets decremented twice in this scenario, removing the need for the
* usual predecrement before entering the loop.
*
* Entry: d7 -- length of Worm in long words.
*        d6 -- base of memory area to test.
*        d5 -- address mask for display boundary.
*        a5 -- first long word address of Worm at present.
*        a4 -- first long word address of Worm's original image.
*        a3 -- display manager's address.
*
* Exit:  d0 -- W_LONGS complement of pass count if error.
*        a5 -- entry value less relocation, i.e.: next pass entry value.
*        a1 -- address pass/fail report value.
*
* Uses:  d0 -- decrementing Worm length.
*        a2 -- incrementing COMPARE address.
*        a1 -- incrementing TO address.
*        a0 -- incrementing FROM address.
*
* Unused: d4, d3, a7, a6.
*
Worm    move.w  d7,d0           Restore the Worm's length,
        move.l  a5,a0           its starting point,
        move.l  a4,a2           and its original address.
        lea     -4(a5),a1       Get the destination for this pass.
******** BEGIN SYSTEM-DEPENDENT CODE ********
        trap    #DISABLE        Don't interrupt this critical passage!
******** END SYSTEM-DEPENDENT CODE ********
Crawl   move.l  (a0)+,(a1)      Move a long word piece of Worm,
        cmp.l   (a1)+,(a2)+     and check it against the original,
        dbne    d0,Crawl        one long word at a time.
******** BEGIN SYSTEM-DEPENDENT CODE ********
        trap    #ENABLE         Allow interrupts -- critical section over.
******** END SYSTEM-DEPENDENT CODE ********
        sub.l   #REL_SIZ,a5     Update the new Worm address,
        nop                     keep the whole thing on long boundary,
        jmp     (a3)            report to the Manager.
*
* The following pattern (which is notoriously hard on 16-bit dynamic RAM
* memories) gets left in memory and can be checked later if desired.
*
Droppings
        DC.L    $5555AAAA       Pattern to be left in RAM.
W_SIZ   EQU     *-Worm          Length of self-relocating code, in bytes
W_LONGS EQU     W_SIZ/4         and longs.
***** Init ********************************************************************
* Init performs system-dependent initialization and sets up registers for use
* of Worm and Manager. Init then copies the Worm into the top of test memory
* and starts the Worm crawling.
*
* Entry: not applicable.
*
* Exit:  a5 -- Worm's test image address at top of memory to be tested.
*        a4 -- Worm's permanent image address.
*        a3 -- Manager routine pointer.
*        d7 -- length of Worm in long words.
*        d6 -- base of memory area to test.
*        d5 -- address mask for testing display boundary.
*
Ovrly   EQU     *               This area will be overlaid with the worm.
LogMsg  DC.B    'Worm memory tester, '
        DC.B    '$Header: worm.a,v 1.2 86/03/24 01:44:36 jans Exp $'
        DC.B    CR,'Memory checked down to location:',CR
L_SIZ   EQU     *-LogMsg
        EVEN
        GLOBAL  Init
Init
*
* First, perform some system-dependent initialization: set up the TRAPs needed
* to protect the Worm from interrupts, protect the area to be tested from page
* faults, and write a welcome message.
*
******** BEGIN SYSTEM-DEPENDENT CODE ********
        SYS     cpint,SIGTRAP2,Disable  Set up the exception handlers for the
        SYS     cpint,SIGTRAP3,Enable   interrupt exclusion routines.
        SYS     memman,1,MemBeg,MemEnd  Protect memory image from page faults.
        move.l  #1,d0                   Prepare and write a stdout
        SYS     write,LogMsg,L_SIZ      welcome message.
******** END SYSTEM-DEPENDENT CODE ********
*
* Next, set up registers that will be used by the Worm and Manager.
*
        move.l  #D_MASK,d5      Get the Display address boundary mask.
        lea     Ovrly(pc),a0    Load the lowest address to test
        move.l  a0,d6           into a data register for comparison,
        lea     Manager(pc),a3  get the Display Manager's address,
        lea     Worm(pc),a4     the Worm's non-crawling image address,
        move.l  #MemEnd-W_SIZ,a5        and the high-mem Worm start address.
        move.w  #W_LONGS,d7     Get the Worm's length in longs.
*
* Finally, move the Worm to the top of memory to be tested.
*
        move.l  a4,a0           Get a copy of Worm's permanent image pointer,
        move.l  a5,a1           its test image pointer,
        move.w  d7,d0           and its length in longs.
        sub.w   #1,d0
MoveWorm
        move.l  (a0),(a1)       Move, and compare
        cmp.l   (a0)+,(a1)+     a long word of the Worm
        dbne    d0,MoveWorm     at a time.
        tst.w   d0              Exit loop by error, or countdown?
        bpl     Manager         Error, go Report it.
        jmp     (a5)            Countdown. Start Crawling!
C_SIZ   EQU     *-MemBeg        (Size of non-relocating code.)
        DS.B    MEM_SIZ-C_SIZ
MemEnd  EQU     *
        ENDDEF
        END     Init            (Set transfer address to the Init.)


Improved Integer 
Square Root Routine 


Jim Cathey 


Here's an integer square root routine that has been optimized 
for arguments of various sizes. The actual routine is broken 
into three parts: a part for arguments no larger than a single 
word; a part for arguments larger than one word (with two 
of the loops unrolled so that a quick word-oriented loop may 
be used where there is no danger of overflow); and a special 
routine that handles particularly small arguments, used 
when it would be quicker than the normal word routine. 
This program yields correct results over the entire
range of arguments from 0 to $3FFFFFFFF.
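
Before reading the assembly, it may help to see the two underlying methods in C. The sketch below is a portable model, not a translation of the listing: the function names are invented for the illustration, and the 625 cutoff simply mirrors the comparison the routine makes when choosing between its quick and general paths.

```c
#include <stdint.h>

/* Small arguments: a perfect square k*k is the sum of the first k odd
 * integers, so count how many successive odd numbers can be subtracted. */
uint32_t isqrt_small(uint32_t n)
{
    uint32_t odd = 1, root = 0;
    while (n >= odd) {
        n -= odd;
        odd += 2;
        root++;
    }
    return root;
}

/* General case: bring down two argument bits per pass (the classical
 * digit-by-digit method, worked in binary). */
uint32_t isqrt_bitpair(uint32_t n)
{
    uint32_t err = 0;    /* error term (running remainder) */
    uint32_t root = 0;   /* running estimate */
    for (int pass = 0; pass < 16; pass++) {
        err = (err << 2) | (n >> 30);        /* shift in the top two bits */
        n <<= 2;
        root <<= 1;                          /* running estimate * 2 */
        uint32_t trial = (root << 1) + 1;    /* amount a '1' bit would cost */
        if (err >= trial) {
            err -= trial;
            root++;                          /* we want a '1' bit */
        }
    }
    return root;
}

/* Dispatch on argument size; the 625 cutoff mirrors the listing. */
uint32_t isqrt(uint32_t n)
{
    return (n <= 625) ? isqrt_small(n) : isqrt_bitpair(n);
}
```

The subtraction loop costs time proportional to the root itself, which is why it only pays off for small arguments; the bit-pair loop always runs a fixed number of passes.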


Listing 17.1 


***************************************************************************
*
*       Integer Square Root (32 to 16 bit).
*
*       (Exact method, not approximate).
*
*       Call with:
*               D0.L = Unsigned number.
*
*       Returns:
*               D0.L = SQRT(D0.L)
*
*       Notes:  Result fits in D0.W, but is valid in longword.
*               Takes from 122 to 1272 cycles (including rts).
*               Averages 610 cycles measured over first 65535 roots.
*               Averages 1104 cycles measured over first 500000 roots.
*
***************************************************************************
        .globl  lsqrt

*                               Cycles
lsqrt   tst.l   d0              (4)     ; Skip doing zero.
        beq.s   done            (10/8)
        cmp.l   #$10000,d0      (14)    ; If is a longword, use the long routine.
        bhs.s   glsqrt          (10/8)
        cmp.w   #625,d0         (8)     ; Would the short word routine be quicker?
        bhi.s   gsqrt           (10/8)  ; No, use general purpose word routine.
*                                       ; Otherwise fall into special routine.
*
* For speed, we use three exit points.
* This is cheesy, but this is a speed-optimized subroutine!
*
***************************************************************************
*
*       Faster Integer Square Root (16 to 8 bit). For small arguments.
*
*       (Exact method, not approximate).
*
*       Call with:
*               D0.W = Unsigned number.
*
*       Returns:
*               D0.W = SQRT(D0.W)
*
*       Notes:  Result fits in D0.B, but is valid in word.
*               Takes from 72 (d0=1) to 504 (d0=625) cycles
*               (including rts).
*
*       Algorithm supplied by Motorola.
*
***************************************************************************

* Use the theorem that a perfect square is the sum of the first
* sqrt(arg) number of odd integers.

*                               Cycles
        move.w  d1,-(sp)        (8)
        move.w  #-1,d1          (8)
qsqrt1  addq.w  #2,d1           (4)
        sub.w   d1,d0           (4)
        bpl     qsqrt1          (10/8)
        asr.w   #1,d1           (8)
        move.w  d1,d0           (4)
        move.w  (sp)+,d1        (12)
done    rts                     (16)

***************************************************************************
*
*       Integer Square Root (16 to 8 bit).
*
*       (Exact method, not approximate).
*
*       Call with:
*               D0.W = Unsigned number.
*
*       Returns:
*               D0.L = SQRT(D0.W)
*
*       Uses:   D1-D4 as temporaries --
*               D1 = Error term;
*               D2 = Running estimate;
*               D3 = High bracket;
*               D4 = Loop counter.
*
*       Notes:  Result fits in D0.B, but is valid in word.
*
*               Takes from 512 to 592 cycles (including rts).
*
*               Instruction times for branch-type instructions
*               listed as (X/Y) are for (taken/not taken).
*
***************************************************************************


*                               Cycles
gsqrt   movem.w d1-d4,-(sp)     (24)
        move.w  #7,d4           (8)     ; Loop count (bits-1 of result).
        clr.w   d1              (4)     ; Error term in D1.
        clr.w   d2              (4)
sqrt1   add.w   d0,d0           (4)     ; Get 2 leading bits a time and add
        addx.w  d1,d1           (4)     ; into Error term for interpolation.
        add.w   d0,d0           (4)     ; (Classical method, easy in binary).
        addx.w  d1,d1           (4)
        add.w   d2,d2           (4)     ; Running estimate * 2.
        move.w  d2,d3           (4)
        add.w   d3,d3           (4)
        cmp.w   d3,d1           (4)
        bls.s   sqrt2           (10/8)  ; New Error term > 2* Running estimate?
        addq.w  #1,d2           (4)     ; Yes, we want a '1' bit then.
        addq.w  #1,d3           (4)     ; Fix up new Error term.
        sub.w   d3,d1           (4)
sqrt2   dbra    d4,sqrt1        (10/14) ; Do all 8 bit-pairs.
        move.w  d2,d0           (4)
        movem.w (sp)+,d1-d4     (28)
        rts                     (16)

***************************************************************************
*
*       Integer Square Root (32 to 16 bit).
*
*       (Exact method, not approximate).
*
*       Call with:
*               D0.L = Unsigned number.
*
*       Returns:
*               D0.L = SQRT(D0.L)
*
*       Uses:   D1-D4 as temporaries --
*               D1 = Error term;
*               D2 = Running estimate;
*               D3 = High bracket;
*               D4 = Loop counter.
*
*       Notes:  Result fits in D0.W, but is valid in longword.
*
*               Takes from 1080 to 1236 cycles (including rts.)
*
*               Two of the 16 passes are unrolled from the loop so that
*               quicker instructions may be used where there is no
*               danger of overflow (in the early passes).
*
*               Instruction times for branch-type instructions
*               listed as (X/Y) are for (taken/not taken).
*
***************************************************************************


*                               Cycles
glsqrt  movem.l d1-d4,-(sp)     (40)
        moveq   #13,d4          (4)     ; Loop count (bits-1 of result).
        moveq   #0,d1           (4)     ; Error term in D1.
        moveq   #0,d2           (4)
lsqrt1  add.l   d0,d0           (8)     ; Get 2 leading bits a time and add
        addx.w  d1,d1           (4)     ; into Error term for interpolation.
        add.l   d0,d0           (8)     ; (Classical method, easy in binary).
        addx.w  d1,d1           (4)
        add.w   d2,d2           (4)     ; Running estimate * 2.
        move.w  d2,d3           (4)
        add.w   d3,d3           (4)
        cmp.w   d3,d1           (4)
        bls.s   lsqrt2          (10/8)  ; New Error term > 2* Running estimate?
        addq.w  #1,d2           (4)     ; Yes, we want a '1' bit then.
        addq.w  #1,d3           (4)     ; Fix up new Error term.
        sub.w   d3,d1           (4)
lsqrt2  dbra    d4,lsqrt1       (10/14) ; Do first 14 bit-pairs.

        add.l   d0,d0           (8)     ; Do 15-th bit-pair.
        addx.w  d1,d1           (4)
        add.l   d0,d0           (8)
        addx.l  d1,d1           (8)
        add.w   d2,d2           (4)
        move.l  d2,d3           (4)
        add.w   d3,d3           (4)
        cmp.l   d3,d1           (6)
        bls.s   lsqrt3          (10/8)
        addq.w  #1,d2           (4)
        addq.w  #1,d3           (4)
        sub.l   d3,d1           (8)

lsqrt3  add.l   d0,d0           (8)     ; Do 16-th bit-pair.
        addx.l  d1,d1           (8)
        add.l   d0,d0           (8)
        addx.l  d1,d1           (8)
        add.w   d2,d2           (4)
        move.l  d2,d3           (4)
        add.l   d3,d3           (8)
        cmp.l   d3,d1           (6)
        bls.s   lsqrt4          (10/8)
        addq.w  #1,d2           (4)
lsqrt4  move.w  d2,d0           (4)
        movem.l (sp)+,d1-d4     (44)
        rts                     (16)

        end


A Mandelbrot Program
for the Macintosh


Howard Katz 


This program explores the famous Mandelbrot set, 
which is said to be the most complex object in 
mathematics. The 68000 source code demonstrates 
many of the techniques used in Macintosh programs. 


Programmers looking for new projects can often find inspiration in A. K.
Dewdney's Computer Recreations column in Scientific American.
Dewdney frequently discusses interesting and offbeat algorithms and other
programming matters. His August 1985 column, in particular, seems to have
touched off something like a feeding frenzy among hackers looking for new
algorithmic adventures. In that column, Dewdney discusses the Mandelbrot
set, a mathematical object named in honor of the French mathematician
Benoit Mandelbrot, of fractal fame. Dewdney also provides several strikingly
beautiful computer-generated images of the set, which he calls "the most
complex object in mathematics." Interested readers might refer to Mandelbrot's
classic volume, The Fractal Geometry of Nature (W. H. Freeman), for other
fractal creations.

I will describe here a 68000 program, written using the MDS assembly 
language development system, which produces on a Macintosh screen images 
of the Mandelbrot set. The final application is just over 4000 bytes long. The 
source code, in two sections, is found in Listings 18.1 and 18.2. Listing 
18.1, at just over 600 lines, contains the main body of the program. Listing 
18.2 is the assembler source for a string-to-fixed-point number conversion 
routine, which is assembled separately and then linked with the REL file 
produced by Listing 18.1. 

The algorithm described by Dewdney is surprisingly simple. Of the more
than 700 lines of code in the program, fewer than 40 are dedicated to the
actual calculations involved in the algorithm. The rest of the program is
devoted to dealing with the well-known Macintosh user interface (windows,
dialog boxes and the like) and to handling the conversion and storage of the
user-input parameters, which dictate the area of the set to display and at what
magnification.

The algorithm discussed by Dewdney involves the use of complex
numbers. I'll provide a brief overview of the algorithm and refer readers to
Dewdney's excellent discussion of the subject. Suffice it to say that the
Mandelbrot set is the result of applying an extremely simple iterative
function to each point of interest in the complex plane, where the value
that seeds the function is the position of the point in the plane. The
result of each iteration is a new complex number; if the size of the
number (its distance from the (0, 0) origin of the plane) exceeds 2 at any
point before the iteration runs a predetermined maximum number of times,
then the point lies outside of the set. If the iteration runs its full course and
the size of the complex number remains less than or equal to 2, then the
point lies within the Mandelbrot set. The actual iterative function involves
nothing more than starting with a value of zero, squaring it and adding the
complex value of the point. Each successive result is then fed back into the
iterative function. Note that the terms "within" and "without" are relative: a
"true" rendition of the set would require an infinite number of iterations;
happily, we can obtain pleasing results with as few as 30 or 40 iterations per
point.
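
As a rough illustration of the iteration just described, here is a minimal Python sketch that uses the language's built-in complex numbers rather than the fixed-point arithmetic of the 68000 program. The function name `iterations_to_escape` and the default limit of 40 are assumptions made for the example.

```python
from typing import Optional


def iterations_to_escape(c: complex, max_iter: int = 40) -> Optional[int]:
    """Return the pass on which |z| first exceeds 2, or None if it never does.

    A result of None means the point is treated as lying within the set,
    at least as far as max_iter passes can tell.
    """
    z = 0j
    for n in range(1, max_iter + 1):
        z = z * z + c          # square the current value, then add the point
        if abs(z) > 2.0:       # distance from the origin exceeds 2
            return n
    return None


if __name__ == "__main__":
    for point in (0j, -1 + 0j, 1 + 1j, 2 + 0j):
        print(point, iterations_to_escape(point))
```

The returned count is what a plotting program would use to choose a color or pattern for points that fall outside the set.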


Objectives 

I had two major objectives in mind when I wrote this program. The first 
was to produce attractive and interesting images; the second was to produce 
them as quickly as possible. While the algorithm is quite simple, it is also 
extremely computationally intensive. I wanted to explore as much of the set
as possible, but did not want to sit around any great length of time before 
being able to see the results of a session. 

One final objective was to build up a library of interesting Mandelbrot 
vistas using the command-3 screen dump facility of the Macintosh. In 
addition to storing the actual graphics image, I also wanted to save all the 
relevant parameters so that I could reproduce the session at my leisure. 

In terms of the attractiveness of the screen display, the fact that the 
program runs on the black-and-white Macintosh places it at somewhat of a 
disadvantage in comparison to other machines. All of the MandelZoom 
programs (so named by Dewdney) that I have seen use color and produce 
strikingly beautiful screen images. What the Mac does have is an exceedingly 
crisp and clean display, at a reasonable 342-by-512 resolution. It also has the 
ability to draw using a variable-size pen and with a user-selected pen pattern. 
Patterns take the place of colors in this implementation; I think that the 
results shown in the accompanying screen dumps are quite pleasing. The real
beauty of the Mandelbrot images lies not simply in the graphic image of the
Mandelbrot set itself (the strange, beetle-like object seen in Figure 18.1)
but in allowing the regions adjacent to the boundary of the set to be
set off in different colors, or patterns if you will, depending on the number of
iterations reached before the size of the complex number calculated for each
point exceeds 2. Half the fun of running this program comes from varying the
count "breakpoints" that determine the size of each region.


FIGURE 18.1 Default settings for the program MandelZoom.
(Window title in the screen image: "2 X 2: 151 seconds".)


FIGURE 18.2 A more accurate rendition of the Mandelbrot set, with
all patterns disabled except black and white.
(Window title in the screen image: "1 X 1: 1435 seconds".)


FIGURE 18.3 An interesting area, titled "Dragons under an
alien sun." (Window title in the screen image: "2 X 2: 244 seconds".)


FIGURE 18.4 Zooming in: a miniature Mandelbrot to the left of the
main set. (Window title in the screen image: "2 X 2: 212 seconds".)


Fixed-Point Numbers 

The problem of getting the program to run as fast as possible was an 
interesting one. Derivation of the Mandelbrot set requires the use of real 
numbers, since the complex values used in the computations have fractional 
as well as integer components. Most implementations use floating-point 
numbers for this purpose. On the Mac, floating-point support is normally 
provided by a disk-based package known as SANE, for Standard Apple 
Numeric Environment (now provided in ROM on the Mac Plus). SANE, 
however, seemed a bit slow for my purposes. 

I found documentation for three routines in ROM that supported another 
variety of real-number representation known as fixed-point. In fixed-point 
arithmetic, the integer portion of a number is stored as a 16-bit quantity in 
the high-order word of a 4-byte longword, and the fractional portion is stored 
in the low-order word. A bit of informal benchmarking convinced me that 
fixed-point calculations would run roughly an order of magnitude faster than 
floating-point operations, at the cost of some precision; the tradeoff seemed 
reasonable. I didn't discard the SANE package entirely—I used its conversion 
routines for converting the three user-input parameters from string to SANE 
floating-point format, and then converted from the single-precision floating- 
point format back to fixed-point representation. See Listing 18.2 for the 
tedious details. 
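
To make the 16.16 layout concrete, the following Python sketch models a fixed-point value as a plain integer scaled by 65536. It is an illustration of the representation only: the helper names are invented here, and the simple shift in `fix_mul` does not attempt to reproduce whatever rounding or overflow handling the ROM's own multiply routine performs.

```python
ONE = 1 << 16  # 1.0 in 16.16 fixed point ($00010000)


def to_fixed(x: float) -> int:
    """Convert a real number to 16.16 fixed point (integer part in the high word)."""
    return int(round(x * ONE))


def to_float(f: int) -> float:
    """Convert a 16.16 fixed-point value back to a float."""
    return f / ONE


def fix_mul(a: int, b: int) -> int:
    """Multiply two 16.16 values; the raw product has 32 fraction bits, so drop 16."""
    return (a * b) >> 16


if __name__ == "__main__":
    a = to_fixed(1.5)
    b = to_fixed(2.0)
    print(hex(a), hex(b), to_float(fix_mul(a, b)))
```

In this format the escape test "size squared greater than 4" becomes a comparison against `to_fixed(4.0)`, that is, $40000.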


ROM Conventions 

The program makes use of a number of the 500-odd routines built into the 
Macintosh ROM. It's beyond the scope of this chapter to discuss all the 
details of how they are used—Apple's Inside Macintosh documentation 
devotes over 1000 pages to that task—but a quick overview might be useful 
for readers unfamiliar with the Macintosh. Most of these routines, or "traps," 
are dedicated to implementing the Mac user interface. Traps can be identified 
in the source code as identifiers preceded by an underscore, such as 
_GetNextEvent or _PenPat. The file "MacTraps.D," which is "included" at 
the top of the main listing, is simply a long list of equates, in which each 
trap name is equated to a unique, 2-byte hexadecimal value that starts with a 
hex $A. This makes use of the 68000 "line 1010 trap" feature (a hexadecimal
$A is 1010 in binary), in which execution of any instruction whose first 
nibble is a hex $A forces the processor to suspend its current operations and 
vector through an address in low memory to a trap dispatch table, where the 
following three nibbles of the instruction are decoded to determine which 
particular trap routine to execute. Simple, right? 
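
The first step of that decoding can be sketched in a few lines of Python. This is only an illustration of the nibble test described above: the function names are made up for the example, 0xA970 is used simply as a sample word that begins with $A, and the way the real dispatcher interprets the remaining 12 bits is not modeled.

```python
def is_a_line_trap(opcode: int) -> bool:
    """True if the top nibble of a 16-bit instruction word is $A (binary 1010)."""
    return ((opcode >> 12) & 0xF) == 0xA


def trap_selector(opcode: int) -> int:
    """Return the three nibbles that follow the leading $A."""
    return opcode & 0x0FFF


if __name__ == "__main__":
    for word in (0xA970, 0x4E75):   # a sample trap word, and the RTS opcode
        print(hex(word), is_a_line_trap(word), hex(trap_selector(word)))
```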

Parameter-passing for the ROM routines follows Pascal conventions, in
which the parameters are pushed onto the stack in the order documented in
Inside Macintosh. If a parameter is longer than 4 bytes, a pointer to its
address is passed instead of the actual data. And if the routine is a function
(and therefore returns a value), space must be cleared on the stack for the
function result before the parameters are pushed, and the result popped from
the stack once the routine returns. Data registers d0, d1 and d2, and address
registers a0 and a1 are not guaranteed to be preserved by the ROM routines.
Probably the most common mistake made by new Macintosh assembly
language programmers is using these registers to hold values that they expect
to find there after making a trap call. Most of the operating system routines
found in ROM do not use the above stack-passing convention; a register-based
parameter-passing convention is used instead. Finally, you should note
that many of the operands referenced in the program have "(A5)" suffixed to
their names. That indicates that the operand in question was defined using the
DS "Define Storage" assembler directive at the end of the source listing. All
variables so created are referenced in code as an offset off register a5.
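
The stack discipline for a function-type routine can be mimicked with a Python list standing in for the 68000 stack. This is a toy model under stated assumptions: `fixed_multiply_routine` is a hypothetical stand-in for a ROM function that multiplies two fixed-point longwords, and real stack cells, sizes and registers are not represented.

```python
def fixed_multiply_routine(stack: list) -> None:
    """Hypothetical function-type routine: pops its two parameters and
    leaves the result in the slot the caller reserved beneath them."""
    b = stack.pop()
    a = stack.pop()
    stack[-1] = (a * b) >> 16


def call_with_pascal_convention(a: int, b: int) -> int:
    stack = []
    stack.append(0)      # clear space for the function result first
    stack.append(a)      # then push the parameters in documented order
    stack.append(b)
    fixed_multiply_routine(stack)
    result = stack.pop() # pop the result once the routine returns
    assert not stack     # the stack is balanced again
    return result


if __name__ == "__main__":
    print(hex(call_with_pascal_convention(0x18000, 0x20000)))
```

The same reserve, push, call, pop sequence appears in the listing wherever a function trap is used.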


Program Description 

The program uses two dialog boxes and one window. Windows and dialogs 
are two examples of user-interface objects supported by a number of routines 
in the Macintosh ROM. Dialog boxes primarily serve as templates for 
guiding user keyboard input, as well as providing "button" controls that the 
user can click to select among choices. While windows and dialogs can be 
defined in code, it is generally much simpler, and provides better 
documentation, for programmers to define them using Apple's RMaker (or 
"Resource Maker") program. The concept of resources is, as far as I know, 
unique to Macintosh, and it would take much more space than this to do it 
justice. RMaker is generally run last in the development sequence, following 
linking. Listing 18.3 is the source file that is input to RMaker for the 
MandelZoom program. 

The first dialog box, "Parameters," allows the user to select the x- and y-
coordinates of the region to be plotted, the size of the region and the count
"breakpoints" that determine which patterns are associated with which count
ranges. The x- and y-coordinates refer to the lower left-hand corner of the
drawing window, which comes up once the dialog is dismissed by clicking
the Plot button. The Side parameter gives the length of the region along the
y-axis of the window; the length of the image along the x-axis is scaled
according to the ratio of the window's width to its height. You can cycle
through the input fields using
the tab key. 
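
The scaling of the region to the window can be sketched as follows. This Python fragment is an illustration under assumptions of my own: the function name and its parameters are invented, floating-point values are used instead of fixed-point, and the window frame offsets that the program subtracts are ignored.

```python
def plot_geometry(y_side: float, width_px: int, height_px: int, pen: int):
    """Return (x_side, delta_y, delta_x) for a plot region.

    y_side    -- length of the region along the y-axis (the Side parameter)
    width_px  -- plottable width of the window in pixels
    height_px -- plottable height of the window in pixels
    pen       -- pen size in pixels; one point is plotted per pen-sized cell
    """
    x_side = y_side * width_px / height_px   # scale by width-to-height ratio
    rows = height_px // pen                  # plottable points on the y-axis
    cols = width_px // pen                   # plottable points on the x-axis
    return x_side, y_side / rows, x_side / cols


if __name__ == "__main__":
    print(plot_geometry(2.0, 512, 256, 2))
```

The two increments are the amounts added to the imaginary and real parts of the point as the plot steps from row to row and from column to column.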

The first (top) count on the right side of the dialog box is the maximum 
number of iterations that will be performed for each point. If the program can 
iterate this number of times, the point will be drawn in a solid black pattern. 
If the size of the complex number produced by the iteration exceeds 2 at any 
point, then a lighter pattern will be used. Suitable selection of these four 
"breakpoint" count values allows the user to turn one or more of the patterns 
on or off, or to vary the thickness of the various count "regions." Figure
18.2, for example, shows a count selection that disables all patterns except
black and white, for a crisp representation of the Mandelbrot set itself. 
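
One way to picture how the four breakpoint counts select a pattern is the small Python sketch below. It is an assumption-laden illustration: the pattern names and their order are taken from the comments in the listing's pattern-selection routine, the breakpoints are assumed to be entered in descending order, and the function name is invented.

```python
PATTERNS = ["black", "dark gray", "light gray", "white", "gray"]


def pattern_for(count: int, breakpoints: list) -> str:
    """Pick the pattern for a point from its iteration count.

    breakpoints holds the four count values from the dialog, maximum first.
    The first breakpoint that the count reaches decides the pattern; a count
    below all four falls through to the last pattern.
    """
    for index, threshold in enumerate(breakpoints):
        if count >= threshold:
            return PATTERNS[index]
    return PATTERNS[len(breakpoints)]


if __name__ == "__main__":
    counts = [40, 20, 10, 5]
    for n in (40, 25, 12, 5, 3):
        print(n, pattern_for(n, counts))
```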

Finally, the dialog allows the user to choose one of three pen sizes using 
the "radio buttons" in the lower left corner of the box. The default selection is 
for a 2-by-2 pen. I usually use the 4-by-4 pen when exploring a new region 
for the first time because it provides a quick (though "chunky") plot. If the 
image looks suitably interesting, I'll continue my explorations using the finer 
2-by-2 pen. The 1-by-1 pen is most suitable for producing high-quality 
images of the boundary of the Mandelbrot set, as shown in Figure 18.2. 

Once the user clicks the OK button at the bottom of the dialog box, the 
dialog is erased from the screen, the Mandelbrot window appears and drawing 
begins. At the same time, a second dialog box appears on the right side of the 
screen. This Legend Dialog primarily serves to redisplay the parameters 
entered at step one, so that a screen dump image saved to disk will maintain a 
complete record of what parameters were initially entered for the session. If at 
any time you aren't satisfied with the image being generated, you can click on 
either New Plot to return to the Parameters Dialog or Quit to exit the 
program. 

The central core of any Macintosh application is the "Event Loop." In 
most Macintosh programs, the trap "_GetNextEvent" is continually polled to 
determine if the user has pressed a key on the keyboard or clicked the mouse 
(among other possible user-initiated events). In this program, the Event Loop 
is executed at the end of each Mandelbrot scan line to determine if the user has 
clicked in either of the above buttons. If he has, the appropriate action is 
taken. 

I should note one final feature of the program. In this version, plotting 
takes place only when the pen pattern changes. In the original version of the 
program, the pen pattern was set and each point plotted using QuickDraw's 
_Line command as soon as the iteration for each point was completed. I 
found, however, that plotting ran about 20 percent quicker if I deferred the 
actual drawing until forced to by a change in the pen pattern: one long _Line 
is quicker than many short ones. You can force the program to plot each 
point as it's calculated by holding down the mouse when the program first 
launches. While this actually takes a bit longer to plot, you might find it 
subjectively faster (since the pen is continually drawing). 
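
The deferred-drawing idea amounts to run-length batching of each scan line. The following Python sketch shows the principle only; the function name is hypothetical, patterns are represented by short strings, and no actual drawing is done.

```python
from itertools import groupby


def batch_row(patterns: list, pen: int = 1) -> list:
    """Collapse a row of per-point patterns into (pattern, length_in_pixels) runs.

    Each run corresponds to a single line-drawing call instead of one call
    per point.
    """
    return [(pattern, len(list(run)) * pen) for pattern, run in groupby(patterns)]


if __name__ == "__main__":
    row = ["white", "white", "black", "black", "black", "white"]
    print(batch_row(row, pen=2))
```

A row that is mostly one pattern, which is common away from the boundary of the set, collapses to a handful of drawing calls.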
Listing 18.1

        INCLUDE MacTraps.D
        String_Format 3                 ; pre-length DC.B and PEA Strings
        XREF    Convert_2_Fixed_Point   ; Procedure defined in < Str2FP.ASM >

                                        ; a Mac-style macro
                                        ; ( Mac-Mac ? )
        MACRO   Fix_Squared Rn
        clr.l   -(sp)
        move.l  {Rn}, -(sp)
        move.l  {Rn}, -(sp)
        _FixMul
        move.l  (sp)+, {Rn}

MouseDown        EQU    1               ; for _GetNextEvent
numToString      EQU    0               ; for _Pack7 conversions
stringToNum      EQU    1
Gray             EQU    -24             ; offset from QDVars Ptr
White            EQU    -8
portRect         EQU    16              ; offset from start of Window Record
pnPat            EQU    58              ; offset from start of Window Record
X_Screen_Offset  equ    4
Y_Screen_Offset  equ    4
Row_Pixels       equ    256
Col_Pixels       equ    256
PenSize          equ    2
HiLite_Off       equ    0
HiLite_On        equ    1
Radio_Item_1     equ    9               ; Item Numbers in Params DITL
Radio_Item_2     equ    10
Radio_Item_3     equ    11
X_Org_Item       equ    12
Y_Org_Item       equ    13
Side_Length_Item equ    14
Count_Item_1     equ    15
Org_Spacing      equ    24              ; Space tween X, Y, and S
Max_Count_Digits equ    4               ; Num Digits in 'Count' Item Strings
Count_Str_X      equ    5               ; X_coord of Counts
Count_Str_Y      equ    114             ; Y_coord of 1st ( Max ) Count
Count_Str_Size   equ    10              ; Bytes wide
Legend_Plot_Item equ    1
Legend_Quit_Item equ    2
Pattern_Spacing  equ    30              ; Delta-Y for both Counts & Patts
Pattern_X        equ    62              ; Left for Patts in Legend DLOG
Pattern_Y        equ    86              ; Top for 1st Patt in Legend DLOG
Pattern_Size     equ    8               ; Bytes
X_Org_Scr_X      equ    10
X_Org_Scr_Y      equ    24
Time_Scr_X       equ    10
Time_Scr_Y       equ    16

        st      First_Entry(A5)
        sf      Radio_1_State(A5)
        st      Radio_2_State(A5)       ; default Pen is 2 X 2
        sf      Radio_3_State(A5)

        BSR     InitManagers
        BSR     Save_Mouse_State
        BSR     Draw_Menu_Title

MainLine
        BSR     Open_Params_DLOG
        tst.b   First_Entry(A5)
        BNE.s   @Set_Radios
        BSR     Reload_DITL             ; 2nd time around -
                                        ; get old Parameters
@Set_Radios
        sf      First_Entry(A5)
        BSR     Set_Radio_Buttons
        BSR     Get_Param_Items         ; Get User Choice / if OK, Toggle Radio
        bMI     Exit_To_Shell           ; Buttons, Convert & Save Counts
        BSR     Save_Param_Items        ; Save Str Counts / Convert 3 Fix-Pt Nums
        pea     ParamsDLOGStorage
        _CloseDialog
        BSR     Draw_Mandel_Window
        BSR     Open_Legend_DLOG
        BSR     Draw_Patterns
        BSR     Draw_Org_Strings
        BSR     Timer_On
        BSR     Do_Mandelbrot
        bMI     Exit_To_Shell           ; These 2 in case we've interrupted
        bHI     Do_Another              ; plotting in the middle
        BSR     Timer_Off
        BSR     Write_Time

Wait_4_Command
        BSR     Get_Next_Event
        bEQ     Wait_4_Command          ; No Event
        BSR     Was_Dialog_Event
        bEQ     Wait_4_Command          ; 0 => Hang Around a Bit
        bMI     Exit_To_Shell           ; - => Quit
                                        ; + => do another
Do_Another
        pea     MandelWindStorage
        _CloseWindow
        pea     LegendDLOGStorage
        _CloseDialog
        BRA     MainLine

Exit_To_Shell
        _ExitToShell

Save_Mouse_State                        ; if the Mouse is Down on Launch, we'll
                                        ; _SetPat and _Line for EVERY Point
        sf      Mouse_Down(A5)
        clr     -(sp)
        _Button
        tst     (sp)+
        beq     @rts
        st      Mouse_Down(A5)
@rts    RTS

Draw_Menu_Title
        move.l  #$000F0010, -(sp)
        _MoveTo
        pea     MBarTitle
        _DrawString
        RTS

Reload_DITL
        lea     TempStr, A2
        move    #0, D3
        move    #X_Org_Item, D4
        BSR     Get_Item_Text
        move.l  ItemHandle, -(sp)
        pea     X_Org_Str
        _SetIText
        move    #Y_Org_Item, D4
        BSR     Get_Item_Text
        move.l  ItemHandle, -(sp)
        pea     Y_Org_Str
        _SetIText
        move    #Side_Length_Item, D4
        BSR     Get_Item_Text
        move.l  ItemHandle, -(sp)
        pea     Side_Length
        _SetIText
        lea     TempStr, A2
        lea     Count_Strings, A3
        move    #0, D3
        move    #Count_Item_1, D4

@Reset_Counts
        BSR     Get_Item_Text
        move.l  ItemHandle, -(sp)
        move.l  A3, -(sp)               ; Addr of Current Count_String
        _SetIText
        add.l   #Count_Str_Size, A3
        add     #1, D4                  ; next Item Number in DLOG
        add     #1, D3                  ; increment loop counter
        cmp     #4, D3                  ; done all 4 ?
        BMI     @Reset_Counts           ; no
        RTS

Get_Next_Event
        clr     -(sp)
        move    #-1, -(sp)
        pea     EventRecord
        _GetNextEvent
        tst.b   (sp)+                   ; EQ = No Event
        RTS

Was_Dialog_Event
        clr.b   -(sp)
        pea     EventRecord
        _IsDialogEvent
        tst.b   (sp)+                   ; NE = Was DLOG Event
        bNE.s   @1
        RTS

@1      clr.b   -(sp)
        pea     EventRecord
        pea     theDialog
        pea     ItemHit
        _DialogSelect
        tst.b   (sp)+
        bNE.s   Get_Legend_DLOG_Item
        clr.b   -(sp)
        pea     EventRecord
        pea     ParamsDLOGStorage
        pea     ItemHit
        _DialogSelect
        tst.b   (sp)+
        move    #0, D0
        RTS

Get_Legend_DLOG_Item
        move    ItemHit, D0
        cmp     #Legend_Plot_Item, D0
        bEQ     @Return_Plus
        cmp     #Legend_Quit_Item, D0
        bEQ     @Return_Minus
        move    #0, D0
        RTS
@Return_Minus
        move    #-1, D0                 ; = Quitting
        RTS

@Return_Plus
        move    #1, D0                  ; = Do Another Mandelbrot
        RTS

Timer_On
        clr.l   -(sp)
        _TickCount
        move.l  (sp)+, Start_Time(A5)
        RTS

Timer_Off
        _PenNormal
        move    #4, -(sp)
        _SysBeep                        ; Wake the Poor User
        clr.l   -(sp)
        _TickCount
        move.l  (sp)+, D3
        sub.l   Start_Time(A5), D3      ; ( Stop - Start ) in Ticks
        divu    #60, D3                 ; Num_Seconds (in Low Word)
        RTS

MenuRect
        dc      0, 10, 19, 200

Write_Time
        pea     MandelWindStorage
        pea     TempSTR
        _GetWTitle
        lea     TempSTR, a2
        move.l  a2, a3                  ; save start addr
        clr.l   d5                      ; clear out old junk
        move.b  (a2)+, d5               ; Length Byte
        adda.l  d5, a2                  ; point past last Char in Str
        lea     ': ', a0                ; addr of length byte
        clr.l   d1
        move.b  (a0)+, d1
        add.b   d1, d5                  ; save new length
        move.b  d5, (a3)                ; put back new length byte
        sub     #1, d1
@Loop_1
        move.b  (a0)+, (a2)+            ; add new string to end
        dbra    d1, @Loop_1
        move.b  -(a2), d4               ; save last char
        ext.l   D3
        move.l  D3, D0                  ; Elapsed Time in seconds
        move.l  a2, a0
        move    #NumToString, -(sp)
        _Pack7
        clr.l   d1
        move.b  (a2), d1                ; save New Length Byte
        move.b  d4, (a2)                ; restore last Char of 1st String
        add.b   d1, d5                  ; new length
        move.b  d5, (a3)                ; and put back in Length Byte
        adda.l  d1, a2                  ; point to end of string
        adda.l  #1, a2                  ; points 1 past end
        lea     ' seconds', a1
        move.b  (a1)+, d1               ; save new Length Byte
        ext.w   d1
        add.b   d1, d5                  ; new total Length of Strings
        move.b  d5, (a3)                ; put it back in Length Byte
@Loop_2
        move.b  (a1)+, (a2)+            ; append 'Seconds' to end
        sub     #1, d1
        bhi     @Loop_2
        pea     MandelWindStorage
        pea     TempSTR
        _SetWTitle
        RTS

Do_Mandelbrot
        pea     MandelWindStorage
        move.l  (sp), -(sp)             ; copy WPtr for SetWTitle trap
        _SetPort                        ; draw in this Window
        tst.b   Radio_1_State(A5)
        beq.s   @1
        pea     '1 X 1'
        bra     @SetTitle
@1      tst.b   Radio_2_State(A5)
        beq.s   @2
        pea     '2 X 2'
        bra.s   @SetTitle
@2      pea     '4 X 4'
@SetTitle
        _SetWTitle

@Set_Pen_Size
        move    #PenSize, D3
        tst.b   Radio_2_State(A5)       ; Draw with 2 X 2 Pen ?
        BNE     @Set_Pen                ; yes
        add     D3, D3                  ; Pen = 4 X 4
        tst.b   Radio_3_State(A5)       ; Draw with 4 X 4 ?
        BNE     @Set_Pen                ; yes
        move    #1, D3                  ; Draw with 1 X 1
@Set_Pen
        move    D3, Pix_Per_Pt(A5)
        move    D3, -(sp)
        move    (sp), -(sp)
        _PenSize

@Set_Plot_Size
        lea     MandelWindStorage, a0
        move    portRect+4(a0), d4      ; Window.Bottom
        move    #Y_Screen_Offset, d0
        sub     d0, d4                  ; frame at Bott
        sub     d3, d4                  ; move up 1 PenSize from Bott
        move    d4, Y_Start(a5)
        sub     d0, d4                  ; adjust for frame at Top
        move    d4, Num_Rows(a5)
        move    portRect+6(a0), d4      ; Window.Right
        move    #X_Screen_Offset, d0
        asl     #1, d0                  ; frame at Left & Right
        sub     d0, d4
        sub     d3, d4                  ; allow for penWidth
        move    d4, Num_Cols(a5)

@Get_C_Increment
        move.l  Y_Side(A5), D0
        move    Num_Rows(A5), D5
        ext.l   d5
        divu    Pix_Per_Pt(A5), D5      ; = # of Plottable Pts on Y-Axis
        BSR     Get_Del_Factor          ; Del_Y returned as Fixed Pt
        move.l  D4, Del_C_Imag(A5)      ; in D4
        clr.l   -(sp)
        move    Num_Cols(a5), -(sp)     ; numerator
        move    Num_Rows(a5), -(sp)     ; denominator
        _FixRatio                       ; Fixed-Pt Ratio on stack
        move.l  (sp)+, d0               ; temp save it
        clr.l   -(sp)
        move.l  d0, -(sp)               ; Num_Cols/Num_Rows
        move.l  Y_Side(a5), -(sp)       ; x Y_Side
        _FixMul
        move.l  (sp)+, X_Side(a5)       ; = X_Side
        move.l  X_Side(A5), D0
        move    Num_Cols(A5), D5
        ext.l   d5
        divu    Pix_Per_Pt(A5), D5      ; = # of Plottable Pts on X-Axis
        BSR     Get_Del_Factor
        move.l  D4, Del_C_Real(A5)
        BRA     Continue

Get_Del_Factor
        move.l  D0, D3                  ; save the fractional part
        swap    D0                      ; and get the whole part
        clr.l   -(sp)
        move    D0, -(sp)               ; side ( integer part )
        move    D5, -(sp)               ; pts per side
        _FixRatio                       ; = length ( integer ) / point
        move.l  (sp)+, D4               ; save the ( int ) fraction
        sf      D6                      ; positive fraction
        tst     D3
        bpl     @1
        st      D6                      ; negative fraction
        lsr     #1, D3                  ; zero the hi bit so _FixRatio doesn't
                                        ; think the number is Negative
@1      clr.l   -(sp)
        move    D3, -(sp)               ; side ( fract part )
        move    D5, -(sp)               ; pts per side
        _FixRatio                       ; = length ( fract ) / point
        move.l  (sp)+, D3
        swap    D3                      ; move the 'integer' part of the 'fraction'
        tst.b   D6                      ; back into the fractional lo word
        bpl     @2
        lsl     #1, D3                  ; restore the 'negative' hi bit
@2      and.l   #$FFFF, D3
        add.l   D3, D4
        RTS

Continue
        clr     Row_Count(A5)
        move.l  Y_Origin(A5), C_Imag(A5)
        move    Y_Start(a5), Y_Current(A5)

Do_Next_Row
        clr     Col_Count(A5)
        move    #X_Screen_Offset, -(sp) ; For next row
        move    Y_Current(A5), -(sp)    ; move absolute to start
        _MoveTo
        st      First_Pt(A5)            ; 1st_Point := TRUE;
        move.l  X_Origin(A5), C_Real(A5) ; for start of new row
        BSR     Do_Points
        move    Row_Count(A5), D0
        add     Pix_Per_Pt(A5), D0
        move    D0, Row_Count(A5)
        cmp     Num_Rows(A5), D0
        bMI     @CheckDLOG
        move    #0, d0
        BRA     @Return_To_Mainline
@CheckDLOG
        BSR     Get_Next_Event
        bEQ.s   @Setup_Next_Row
        BSR     Was_Dialog_Event
        bEQ.s   @Setup_Next_Row

@Return_To_Mainline
        RTS

@Setup_Next_Row
        move    Pix_Per_Pt(A5), D0      ; set up Y for next row
        sub     D0, Y_Current(A5)
        move.l  C_Imag(A5), D0
        add.l   Del_C_Imag(A5), D0
        move.l  D0, C_Imag(A5)
        BRA.S   Do_Next_Row

Do_Points
        move.l  C_Real(A5), D5          ; Initialize Z = C for new point
        move.l  C_Imag(A5), D6
        move    #1, Iter_Count(A5)      ; Do up to Counts(A5) times per Point
        lea     Patterns, A4            ; reset Pattern Ptr

Iterate
        move.l  D5, D3                  ; Save Current Z_Real
        move.l  D6, D4                  ; Save Current Z_Imag
        Fix_Squared D3                  ; Z_Real^2
        Fix_Squared D4                  ; Z_Imag^2
        move.l  D4, D7
        add.l   D3, D7                  ; Size^2 = Z_Real^2 + Z_Imag^2

@Test_Size
        move    Iter_Count(A5), D0
        cmp.l   #$40000, D7             ; Size^2 > 4 means TIME TO PLOT
        BHI.s   @Plot

@Test_Count
        add     #1, D0
        move    D0, Iter_Count(A5)
        cmp     Counts(A5), D0
        BPL.s   @Plot

@Get_New_Z
        sub.l   D4, D3                  ; Z_Real = Z_Real^2 - Z_Imag^2
        clr.l   -(sp)
        move.l  D5, -(sp)
        move.l  D6, -(sp)
        _FixMul                         ; Z_Real * Z_Imag
        move.l  (sp)+, D6
        add.l   D6, D6                  ; Z_Imag = 2 * Z_Real * Z_Imag
        move.l  D3, D5                  ; Z_Real
        add.l   C_Real(A5), D5
        add.l   C_Imag(A5), D6
        BRA.s   Iterate

@Plot
        BSR     Get_Pattern             ; A4 = Ptr to New Pattern
        tst.b   First_Pt(A5)            ; IF NOT 1st_Point
        BEQ     @Test_Mouse             ;    Test_Mouse { see if batching }
        sf      First_Pt(A5)            ; ELSE
                                        ;    1st_Point = FALSE;
        move.l  A4, A2                  ;    Old_Pat := New_Pat;
        BRA     @Set_Pattern

@Test_Mouse
        tst.b   Mouse_Down(A5)          ; IF NOT Batch_Plot
        BNE.s   @Draw_Line              ;    Draw_Line ( Old_Pat )
        cmp.l   A2, A4                  ; ELSE
        BNE     @Draw_Line              ;    IF NOT ( New_Pat = Old_Pat )
                                        ;       Draw_Line ( Old_Pat )
                                        ;    ELSE
                                        ;       Init_Line_Amount;
        add     Pix_Per_Pt(A5), A3      ; Do_Next_Pt;
        BRA     @Skip_Draw

@Draw_Line
        move    A3, -(sp)
        move    #0, -(sp)
        _Line                           ; Draw_Line ( Old_Pat )

@Set_Pattern
        move.l  A4, -(sp)               ; set the New Pattern
        _PenPat
        move.l  A4, A2                  ; Old_Pat := New_Pat
        move    Pix_Per_Pt(A5), A3

@Skip_Draw
        move    Col_Count(A5), D0
        add     Pix_Per_Pt(A5), D0
        move    D0, Col_Count(A5)
        cmp     Num_Cols(A5), D0
        BMI.s   @Update_Z_Real

                                        ; we've finished the Line - if we
                                        ; need to draw to finish up
                                        ; do it here
        cmp     Pix_Per_Pt(A5), A3
        BEQ     @rts                    ; we've just drawn
        move    A3, -(sp)               ; else draw what we didn't
        move    #0, -(sp)
        _Line
@rts    RTS

@Update_Z_Real
        move.l  C_Real(A5), D0
        add.l   Del_C_Real(A5), D0
        move.l  D0, C_Real(A5)
        BRA     Do_Points

Get_Pattern                             ; Point to a New PenPat, according to which
                                        ; Range the Iter_Count ( D0 ) falls in
        cmp     0+Counts(A5), D0        ; >= Black
        BPL.s   @0
        add     #8, A4
        cmp     2+Counts(A5), D0        ; >= DarkGray
        BPL.s   @0
        add     #8, A4
        cmp     4+Counts(A5), D0        ; >= LtGray
        BPL.s   @0
        add     #8, A4
        cmp     6+Counts(A5), D0        ; >= White
        BPL.s   @0
        add     #8, A4                  ; < Gray
@0      RTS

Open_Params_DLOG
        clr.l   -(sp)                   ; space for funct result
        move    #100, -(sp)
        pea     ParamsDLOGStorage
        move.l  #-1, -(sp)              ; in front of everything
        _GetNewDialog
        move.l  (sp)+, d0
        RTS

Set_Radio_Buttons
        move    #HiLite_On, D3
        tst.b   Radio_1_State(A5)
        BPL     @2
        move    #Radio_Item_1, D4
        BRA     HiLite_Control
@2      tst.b   Radio_2_State(A5)
        BPL     @3
        move    #Radio_Item_2, D4
        BRA     HiLite_Control
@3      move    #Radio_Item_3, D4
        BRA     HiLite_Control
HiLite_Control
        pea     ParamsDLOGStorage
        move    D4, -(sp)               ; ItemNumber
        pea     ItemType
        pea     ItemHandle
        pea     ItemBox
        _GetDItem
        move.l  ItemHandle, -(sp)
        move    D3, -(sp)
        _SetCtlValue
        RTS

Get_Param_Items
        pea     ParamsDLOGStorage
        move    #X_Org_Item, -(sp)      ; Select 'X_Org' Parameter
        move    #0, -(sp)               ; for Quick Replacement
        move    #32767, -(sp)
        _SelIText

ModalDLOG
        clr.l   -(sp)                   ; no filterProc
        pea     ItemHit
        _ModalDialog
        move    ItemHit, D0
        tst     D0
        BEQ     ModalDLOG
        cmp     #1, D0                  ; Clicked 'OK' = We're Done Dialoging ?
        BEQ     Validate_Items          ; yes - Validate & Convert numeric entries
        cmp     #2, D0                  ; Clicked 'Quit' ?
        BEQ     Set_Exit_Flag           ; yes - tell MainLine
        cmp     #Radio_Item_1, D0       ; Clicked a Radio Button for penSize ?
        bMI     ModalDLOG               ; no - wait for 'OK' or 'Quit'
        cmp     #Radio_Item_3, D0
        bHI     ModalDLOG               ; no - wait for 'OK' or 'Quit'
        BSR     Toggle_Radio_Buttons    ; yes
        BRA     ModalDLOG               ; and wait for 'OK' or 'Quit'

Set_Exit_Flag
        move    #-1, D0
        RTS

Toggle_Radio_Buttons
        move    D0, D5                  ; ( D0 gets trashed by ROM calls )
        move    #HiLite_Off, D3         ; turn off Everything
        move    #Radio_Item_1, D4
        BSR     HiLite_Control
        move    #Radio_Item_2, D4
        BSR     HiLite_Control
        move    #Radio_Item_3, D4
        BSR     HiLite_Control
        sf      Radio_1_State(A5)       ; Flag them as OFF
        sf      Radio_2_State(A5)
        sf      Radio_3_State(A5)
        move    #HiLite_On, D3          ; turn ON the Radio Item
        move    D5, D4                  ; that was Clicked
        BSR     HiLite_Control
        cmp     #Radio_Item_1, D5       ; and Flag the apt Item
        BNE     @2                      ; as ON
        st      Radio_1_State(A5)
        RTS
@2      cmp     #Radio_Item_2, D5
        BNE     @3
        st      Radio_2_State(A5)
        RTS
@3      st      Radio_3_State(A5)
        RTS

Validate_Items
        move    #0, D3                  ; Count - 1
        lea     Count_Strings, A2
        move    #Count_Item_1, D4       ; Item Number of 1st Count ( MaxCount )
@0      BSR     Get_Item_Text           ; get the next Item Text
        BSR     Convert_2_Int           ; Convert theString to Integer
        add     #10, A2                 ; point to next String
        add     #1, D4                  ; and its Item Number
        add     #1, D3                  ; for Next Count Range
        cmp     #4, D3                  ; Done all 4 Count Ranges ?
        BMI     @0                      ; not yet
        RTS                             ; Return to MainLine

Get_Item_Text                           ; A2 points to the String
        pea     ParamsDLOGStorage       ; DLOG Ptr
        move    D4, -(sp)               ; Item Number
        pea     ItemType                ; Not Used
        pea     ItemHandle              ; passed to following ROM call
        pea     ItemBox                 ; Not Used
        _GetDItem
        move.l  ItemHandle, -(sp)
        move.l  A2, -(sp)
        _GetIText
        RTS
Convert_2_Int
        move.l  A2, A0
        move    #StringToNum, -(sp)
        _Pack7                          ; Convert Count to Numeric
        move    D3, D5                  ; Which Count Range ?
        add     D5, D5                  ; Words => Bytes for Offset
        lea     Counts(A5), A0
        move    D0, 0(A0, D5)           ; Index & Save the Count
                                        ; ( Ignore the Hi Byte )
        RTS

Save_Param_Items
        move    #0, D3                  ; D3 not used here
        move    #X_Org_Item, D4
        lea     X_Org_Str, A2
        BSR     Get_Item_Text           ; Following routine deposits
                                        ; DITL text in (A2)

                                        ; A2 (input) points to Decimal DITL String
                                        ; D0 (returned) contains Fixed-Point Conversion
        JSR     Convert_2_Fixed_Point   ; XREF routine to convert from
        move.l  D0, X_Origin(A5)        ; STR format to Fixed-Point
                                        ; format via SANE intermediary
        move    #Y_Org_Item, D4
        lea     Y_Org_Str, A2
        BSR     Get_Item_Text
        JSR     Convert_2_Fixed_Point
        move.l  D0, Y_Origin(A5)

        move    #Side_Length_Item, D4
        lea     Side_Length, A2
        BSR     Get_Item_Text
        JSR     Convert_2_Fixed_Point
        move.l  D0, Y_Side(A5)
        RTS

Draw_Org_Strings
        move    #X_Org_Scr_X, -(sp)
        move    #X_Org_Scr_Y, D3
        move    D3, -(sp)
        _MoveTo
        pea     'X: '
        _DrawString
        pea     X_Org_Str
        _DrawString
        move    #X_Org_Scr_X, -(sp)
        add     #Org_Spacing, D3
        move    D3, -(sp)
        _MoveTo
        pea     'Y: '
        _DrawString
        pea     Y_Org_Str
        _DrawString
        move    #X_Org_Scr_X, -(sp)
        add     #Org_Spacing, D3
        move    D3, -(sp)
        _MoveTo
        pea     'S: '
        _DrawString
        pea     Side_Length
        _DrawString
        RTS

Open_Legend_DLOG
        clr.l   -(sp)                   ; space for Funct result
        move    #101, -(sp)
        pea     LegendDLOGStorage
        move.l  #-1, -(sp)              ; in front of everything
        _GetNewDialog
        move.l  (sp), -(sp)
        _SetPort                        ; so Title prints
        _SelectWindow                   ; so _DialogSelect works
        RTS

Draw_Patterns
        clr     -(sp)                   ; Save Digit Char Width
        move    #'1', -(sp)             ; in D4 for Right-Justifying
        _CharWidth
        move    (sp)+, D4
        move    #Count_Str_Y, Legend_Y_Pos(A5)
        move    #0, D3
        lea     Count_Strings, A3       ; Addr of 1st Count Str

@Draw_Counts
        move    #Count_Str_X, -(sp)
        move    Legend_Y_Pos(A5), -(sp)
        _MoveTo
        cmp.b   #Max_Count_Digits, (A3) ; Truncate STRs if too long
        BMI     @0
        move.b  #Max_Count_Digits, (A3)
@0      clr.l   D0                      ; Right-Justify Count_Strings
        move    #Max_Count_Digits, D0
        clr     D1
        move.b  (A3), D1                ; Byte Count for String
        sub     D1, D0                  ; Del Digits = Max Digits - Actual Digits
        mulu    D4, D0                  ; times Digit Char Width
        move.l  D0, -(sp)               ; = amount to space over
        _Move                           ; Relative Move
        move.l  A3, -(sp)
        _DrawString                     ; Write the Count Range Str
        move.l  (A5), A2                ; QD Vars Ptr
        pea     Gray(A2)
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PenPat 
move.1 
Move 
move.1 
Line 


PenNormal 


move 
add 
move 
add.1 


add 
cmp 
BMI 


move 
move 
lea 


@Draw_Patterns 


move 
swap 
move 


lea 
move.1 
add.1 
move.1 


pea 
move.1 
move .1 


FrameRect 


move.1 


InsetRect 


move.1 
FillRect 


move 
add 
move 
add.1 


add 
cmp 


BMI 


RTS 


#SFFFBO006, -(sp) 


#$0000000D, -(sp) ; 


Legend_Y_Pos(A5), DO ; 
#Pattern Spacing, DO 
DO, Legend_Y_Pos(A5) 
#Count_Str_Size, A3 


#1, D3 
#4, D3 
@Draw_Counts 


#Pattern_Y, Legend_Y_ Pos (A5) 
#0, D3 


Patterns, A3 


Legend_Y Pos(A5), DO 
DO 
#Pattern_X, DO 


TempRect, AO 

DO, (AQ)+ ; 
#$00130013, DO ; 
DO, (AO) ; 


TempRect 
(sp), -(sp) 
(sp), -(sp) 


#$00010001, -(sp) 


A3, -(sp) 


Legend_Y Pos(A5), DO H 
#Pattern Spacing, DO 
DO, Legend_Y_ Pos (A5) 
#Pattern Size, A3 ; 


#1, D3 
#5, D3 
@Draw Patterns 


Draw_Mandel_Window 


clr.1 -(sp) 

move #101, -(sp) 

pea MandelWindStorage 

move .1 #-1, -(sp) 

GetNewWindow 

SetPort ? 
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Move Up & Over a Bit 
Draw a Short Gray Line to 
separate the Patt_Rects 


Back to Black for Next String 


move down for Next String 


point to Next String 


Top 


Left 


TopLeft 
19 x 19 
BottomRight 


push 2 copies of Rect Addr 


move down for Next String 


point to Next Pattern 


nuthin hops if we don't do this 
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lea MandelWindStorage, 
pea portRect (A0) 
EraseRect 
RTS 
InitManagers 
pea -4 (AS) 
_InitGraf 
_InitFonts 
_InitWindows 
_InitMenus 
cir.L - (sp) 
_InitDialogs 
_TEInit 
_InitCursor 
RTS 
} SRS Se re SSeer= Constants 
MBarTitle de.b "Drs 
ALIGN 2 
MandelWindStorage dcb.b 156, 0 
ParamsDLOGStorage dcb.b 170, 0 
LegendDLOGStorage dcb.b 110;,> © 
ItemHit dc.w 0 
ItemType dc.w 0 
ItemHandle dc.1 0 
ItemBox dcb.1 2, 0 
theString dcb.b 256, 0 
theDialog del 0 
X_Org_ Str dcb.b 10, 0 
Y Org Str dcb.b 10, 0 
Side_Length dcb.b 10, 0 
Count_Strings dcb.b 40, 0 
TempRect deb.1 2, 0 
TempSTR dcb.b 40, 0 
EventRecord 
What: dc.w 0 
Message: dc.1 0 
When: de.1 0 
Where: de.1 0 
Modifiers: dc.w 0 
Patterns 
de.1 SFFFFEFFF 
de.l SFFFFFFFF 
de:..1, SFFAAFFAA 
dc.1 SFFAAFFAA 


( in Code Space ) 


AO 


7 


the slings and arrows .. . 


Dobb''s MandelZoom' 


? 


’ 


Not Used 


passed from GetDItem to _GetIText 


Not Used 


for dialogPtr returned by _IsDialogEvent 


4 X 10 Bytes each 


holds the Patt Rects for the Legend 


4 pixels per 4 


3 pixels per 4 


black 


dark gray 
dc.l
dc.l
dc.l
dc.l
dc.l
dc.l

X_Origin 

Y_Origin 

X_Side 

Y_Side 

Y_Start

Num_Rows 

Num_Cols 

Pix_Per_Pt

Counts 


Iter_Count 
Row_Count 
Col_Count 


y_current 
x_current 


C_Real 


C_imag 


Z_Real 

Z_imag 

Del_C_Real
Del_C_imag 
Legend_Y_Pos 
Start_Time 
Radio_1_State
Radio_2_State
Radio_3_State
First_Entry 
Mouse_Down 


First_Pt 


END 


$AA00AA00
$AA00AA00


$00000000
$00000000


$AA55AA55
$AA55AA55


ds.b 


ds.b 


Variables ( off A5 )


PRPRP PR 


rPRPPR 


PRPRPPH 


7 


; 


1 pixel per 4 light gray 


0 pixels per 4 pure white 


2 pixels per 4 = gray 


Fixed-Pt conversions from 
User Entries in Params DLOG 


Set to X_Side for now 


where Pen first Plots 


4 INTEGERs dividing the Iterative 
Domain into 5 Ranges (& Patterns) 
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Listing 18.2 


; This File is Linked with MandelZoom.ASM to provide String-to-Floating-Point
; and Floating-Point to Fixed-Point Conversions for the
; X_Org, Y_Org, and Side_Len DITL Parameters

; At present only Single-Precision SANE Conversions are used

; A2 = ptr to the Decimal String on Input
; D0 = the Fixed-Pt Number for Output

String_Format 3


Include MacTraps.D 
Include SANEMacs.Txt 


XDEF Convert_2_Fixed_Point 

Sign EQU 0 ; Byte Offsets in Decimal Record 

Exp EQU 2 

Sig EQU 4 

FP_Sign EQU 31 ; Bit Offsets in Single-Precision Result
FP_Exp EQU 30

FP_Sig EQU 22

SP_Exp_Bias EQU 127 

DP_Exp_Bias EQU 1023 ; Code for Double-Precision not written 


Convert_2_Fixed_Point

        lea     Temp_String, a0         ; make a copy of incoming string
        move.b  (a2), d0                ; its length
@0      move.b  (a2)+, (a0)+
        DBRA    d0, @0
        lea     Temp_String, a0         ; replace ptr to theString
        BSR     Build_Decimal_Record
        pea     Decimal_Record
        pea     FP_Num
        FDEC2S                          ; a SANE 'trap'
        BSR     Build_Fixed_Pt
        RTS                             ; to Mandelbrot


Build_Fixed_Pt

        sf      d2                      ; assume Positive
        lea     FP_Num, a0
        move.l  (a0), d1                ; save the Exponent
        lsl.l   #1, d1                  ; shift Sign into Carry
        bcc     @1                      ; was Positive
        st      d2                      ; flag as Negative
@1      swap    d1                      ; Move Exp into Low Word
        lsr.w   #8, d1                  ; shift Exp to Right
        sub.b   #SP_Exp_Bias, d1        ; unbias it

        move.l  (a0), d0                ; move orig FP_Num into register
        and.l   #$007FFFFF, d0          ; clear Exp
        bset.l  #23, d0                 ; add the leading '1' bit

        tst.b   d2
        beq     @2                      ; num is +
        neg.l   d0
@2      sub.b   #7, d1                  ; Neg(7-Exp) is amount to shift
        neg.b   d1
        bmi     @Shift_Left
        asr.l   d1, d0
        bra     @rts

@Shift_Left
        neg.b   d1                      ; Max Left Shift not checked for yet
        asl.l   d1, d0

@rts    RTS


Build_Decimal_Record

; Strip the Sign Char
; Strip the Decimal Pt and Decrease the Exponent Accordingly
; Finally Strip Leading Zeroes

        lea     Decimal_Record, a1      ; Zero the Record
        move.l  #0, (a1)+
        move.l  #0, (a1)+
        move.l  #0, (a1)+
        move.l  #0, (a1)+
        lea     Decimal_Record, a1
        cmp.b   #'+', 1(a0)             ; Strip the Plus Sign, if any
        BNE     @Strip_Minus_Sign
        BSR     Shift_Count_Byte
        bra     Strip_Decimal_Pt

@Strip_Minus_Sign
        cmp.b   #'-', 1(a0)
        BNE     Strip_Decimal_Pt
        move.b  #1, Sign(a1)            ; Mark Dec_Rec Sign as Negative
        BSR     Shift_Count_Byte

Strip_Decimal_Pt
        move.b  (a0), Sig(a1)           ; move Count to Decimal Record
        lea     1+Sig(a1), a2           ; point to 1st Digit
        add.l   #1, a0                  ; points to 1st Digit in Src Str
        clr     d0
        move.b  Sig(a1), d0             ; length of String
        sub.b   #1, d0                  ; - 1 for DBRA
        sf      d1                      ; Passed_Decimal_Pt Flag = FALSE
@0      cmp.b   #'.', (a0)
        beq     @Found_Decimal_Pt
        move.b  (a0)+, (a2)+            ; shift the digit to Decimal Rec
        tst.b   d1                      ; are we past the Decimal Pt ?
        beq     @Test_EOStr
        sub     #1, Exp(a1)
        bra     @Test_EOStr

@Found_Decimal_Pt
        st      d1
        add.l   #1, a0                  ; point past decimal point
        sub.b   #1, Sig(a1)             ; Count := Count - 1

@Test_EOStr
        DBRA    d0, @0

Strip_Leading_Zeroes
        lea     Sig(a1), a0             ; point @ Count Byte in
        move.b  (a0), d0                ; Decimal_Record Sig Field
        sub.b   #1, d0                  ; setup for DBRA
        ext.w   d0
        sf      d1                      ; No Leading Zeroes (yet)

@Loop_1
        cmp.b   #'0', 1(a0)
        BNE     @Test_4_Shift           ; Encountered a Signif Digit -> Done
        st      d1
        BSR     Shift_Count_Byte
        DBRA    d0, @Loop_1

@Test_4_Shift
        tst.b   d1                      ; Any Non-Significant Zeroes Found ?
        BEQ     @rts                    ; no
        lea     Sig(a1), a1             ; point to Sig Count Byte
        move.b  (a0), d0                ; a0 is Count Byte (wherever it is)
        ext.w   d0

@Loop_2
        move.b  (a0)+, (a1)+            ; shift Count + Digits to Left
        DBRA    d0, @Loop_2

@rts    RTS

Shift_Count_Byte
        sub.b   #1, (a0)                ; Length = Length - 1
        move.b  (a0)+, (a0)             ; move Count Byte over one
        RTS

FP_Num  dcb.b   8, 0                    ; working space for both
                                        ; Single & Double Precision numbers
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Decimal_Record 


dc.w 0 ; Sign

dc.w 0 ; Exp

dcb.b 12, 0 ; Sig
Temp_String dcb.b 12, 0


END 
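
The bit manipulation in Build_Fixed_Pt above can be easier to follow in a higher-level form. The following C sketch is not part of the original listing; the function name single_to_fixed is hypothetical. It assumes an IEEE single-precision float and a 16.16 fixed-point result, which is what the shift by (7 - exponent) in the listing works out to. Unlike the listing, it shifts the magnitude first and applies the sign afterward, so negative values with lost fraction bits truncate toward zero rather than toward minus infinity. Like the listing, it does no real overflow handling and simply asserts that the value fits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert an IEEE single-precision value to 16.16 fixed point.
 * Hypothetical helper mirroring the steps of Build_Fixed_Pt:
 * unbias the exponent, restore the hidden leading 1 bit, then
 * shift the 24-bit significand by (7 - exponent). */
int32_t single_to_fixed(float f)
{
    uint32_t bits;
    uint32_t mant, mag;
    int negative, exp, shift;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);             /* raw bit pattern */

    negative = (bits >> 31) != 0;               /* sign bit */
    exp = (int)((bits >> 23) & 0xFF) - 127;     /* unbiased exponent */
    mant = (bits & 0x007FFFFFu) | 0x00800000u;  /* add the leading 1 */

    assert(exp <= 14);                          /* result must fit in 16.16 */
    shift = 7 - exp;                            /* same count as the listing */
    if (shift >= 0)
        mag = (shift < 32) ? (mant >> shift) : 0;
    else
        mag = mant << (-shift);

    return negative ? -(int32_t)mag : (int32_t)mag;
}
```

For the default dialog entries, 1.25 becomes $00014000 and -2.00 becomes $FFFE0000.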


Listing 18.3 


* Output file = 


Mandels:Mandel_2 
APPLKATZ 


* Input file = 
Include Mandels:Mandel_2.CodeR 


Type WIND ;; Mandelbrot Window
,101


41 6 336 404
Visible NoGoAway


0 ;; docProc

0

Type DLOG ;; 'Legend' Dialog
,101


no message 

30 415 330 500 

Visible NoGoAway 

1 ;; DBoxProc


101


Type DITL
,101
2 ;; 2 Items


Button ;; Item #1
240 3 265 83
New Plot


Button ;; Item #2
270 3 295 83
Quit


Type DLOG ;; Parameters Dialog
,100
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no message 

50 100 250 400 

Visible NoGoAway 

1 ;; DBoxProc

100

Type DITL
,100
18 ;; 18 Items

Button ;; Item #1
135 130 160 220
Plot

Button ;; Item #2
165 130 190 220
Quit

StaticText Disabled ;; Item #3
8 30 25 235
Mandelbrot Parameters

StaticText Disabled ;; Item #4
40 15 56 75
X_Origin

StaticText Disabled ;; Item #5
70 15 86 75
Y_Origin

StaticText Disabled ;; Item #6
100 15 116 75
Side

StaticText Disabled ;; Item #7
65 180 80 240
Counts

StaticText Disabled ;; Item #8
155 15 170 45
Pen

RadioButton ;; Item #9
135 50 150 105
1X1

RadioButton ;; Item #10
155 50 170 105
2X2

RadioButton ;; Item #11
175 50 190 105
4X4

EditText ;; Item #12 X_Origin
40 80 55 144
-2.00

EditText ;; Item #13 Y_Origin
70 80 85 144
1.25

EditText ;; Item #14 Side Length
100 80 115 144
2.500

EditText ;; Item #15 4 initial Defaults for Patt Ranges
20 245 35 280
32

EditText ;; Item #16
50 245 65 280
12

EditText ;; Item #17
80 245 95 280
6

EditText ;; Item #18
110 245 125 280
4


TYPE ALRT 
Peal 
40 100 180 400 
1, 
VIEL 377 Default = Item 1 / Draw Box / 3 Beeps ( ALL stages ) 
i 0 7 1 / 10 


TYPE DITL 
al 


Button 
100 220 120 270 
OK 


StaticText Disabled 
40 30 60 290 
Numeric Digits Only 


19


Improved Binary Search Routine


Michael P. McLaughlin


Presented below is a simple binary search of a table of 
longwords, with one modification: If the target value is not 
found, it returns a negative number whose absolute value is 
the position at which the target would have been found had 
it been there. This modification adds to the utility of the 
routine and makes possible some improvements to the 
calling routine. For example, if the target is not present 
you may want to insert it. The binary search makes that 
easier, since you'll already know the insertion point. 
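
The return convention described above can be sketched in C before reading the assembly. This is an illustration under stated assumptions, not the author's routine: the name binsrch and its parameters are hypothetical, and it returns array indices rather than the byte displacements that Listing 19.1 returns in D6. As in the listing, table[0] is assumed to be a dummy entry smaller than any value that will ever be searched for, which guarantees that a failed search returns a strictly negative number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Search table[1..count], sorted ascending, for target.
 * table[0] must be a dummy smaller than any possible target.
 * Returns the index of target if present; otherwise returns
 * minus the index at which target would have to be inserted. */
long binsrch(const int32_t *table, long count, int32_t target)
{
    long bottom = 0;        /* starts on the dummy entry */
    long top = count;       /* index of the last real entry */

    assert(table != NULL && count >= 0);

    while (bottom <= top) {
        long mid = bottom + (top - bottom) / 2;
        if (target == table[mid])
            return mid;             /* found it */
        if (target > table[mid])
            bottom = mid + 1;       /* try upper half */
        else
            top = mid - 1;          /* try lower half */
    }
    return -bottom;                 /* absent: minus insertion point */
}
```

A caller that wants to insert a missing value can negate a negative result to get the slot at which the new entry belongs, then shift the later entries up by one.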


Listing 19.1 


*BINARY SEARCH -- (USES D4-D7,A6)
;TO SEARCH A SORTED ARRAY OF SIGNED LONGWORDS BEGINNING WITH A DUMMY ENTRY
;SMALLER THAN ANY POTENTIAL ENTRY, INITIALIZE THE REGISTERS AS FOLLOWS:
;
; A6 = BASE ADDRESS OF ENTIRE ARRAY (LONGWORD)
; D4 = TARGET LONGWORD
; D7 = LENGTH OF ARRAY, IN BYTES, LESS DUMMY (LONGWORD)
;
;THE SEARCH RETURNS, IN D6, THE DISPLACEMENT (BYTES FROM BASE)
;OF THE START OF THE TARGET LONGWORD, IF PRESENT, OR -DISPLACEMENT IF THE
;TARGET IS ABSENT.
;


BINSRCH CLR.L   D5                      ;D5 = pointer to bottom
        CLR.L   D6
B1      CMP.L   D5,D7                   ;bottom > top ?
        BMI.S   FAILURE                 ;yes, exit
        MOVE.L  D7,D6                   ;else D6 = (D5+D7) div 2
        ADD.L   D5,D6
        LSR.L   #1,D6
        AND.L   #0FFFFFFFDH,D6          ;back to longword boundary
        CMP.L   0(A6,D6.L),D4           ;is this it ?
        BEQ.S   SUCCESS                 ;yes
        BGT.S   B2                      ;no, target is bigger
        SUBQ.L  #4,D6                   ;target is smaller
        MOVE.L  D6,D7                   ;try lower half
        BRA     B1
B2      ADDQ.L  #4,D6
        MOVE.L  D6,D5                   ;try upper half
        BRA     B1
FAILURE NEG.L   D5                      ;return -displacement
        MOVE.L  D5,D6
SUCCESS RTS
        END
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